(Quick note for those who don't know: the Singularity Institute for AI is not affiliated with Singularity University, though there are some overlaps ... Ray Kurzweil is an Advisor to the former and the founder of the latter; and I am an Advisor to both.)

Following that discussion, a bunch of people have emailed me in the last couple weeks asking me to write something clearly and specifically addressing my views on SIAI's perspective on the future of AI. I don't want to spend a lot of time on this but I decided to bow to popular demand and write a blog post...

Of course, there are a lot of perspectives in the world that I don't agree with, and I don't intend to write blog posts explaining the reasons for my disagreement with all of them! But since I've had some involvement with SIAI in the past, I guess it's sort of a special case.

First of all I want to clarify I'm not in disagreement with the existence of SIAI as an institution, nor with the majority of their activities -- only with certain positions habitually held by some SIAI researchers, and by the community of individuals heavily involved with SIAI. And specifically with a particular line of thinking that I'll refer to here as "SIAI's Scary Idea."

Roughly, the Scary Idea posits that: If I or anybody else actively trying to build advanced AGI succeeds, we're highly likely to cause an involuntary end to the human race.

Brief Digression: My History with SIAI

Before getting started with the meat of the post, I'll give a few more personal comments, to fill in some history for those readers who don't know it, or who know only parts. Readers who are easily bored may wish to skip to the next section.

SIAI has been quite good to me, overall. I've very much enjoyed all the Singularity Summits they've hosted; I think they've played a major role in the advancement of society's thinking about the future, and I've felt privileged to speak at them. And I applaud SIAI for consistently being open to Summit speakers whose views are strongly divergent from those commonly held in the SIAI community.

Also, in 2008, SIAI and my company Novamente LLC seed-funded the OpenCog open-source AGI project (based on software code spun out from Novamente). The SIAI/OpenCog relationship diminished substantially when Tyler Emerson passed the leadership of SIAI along to Michael Vassar, but it was instrumental in getting OpenCog off the ground. I've also enjoyed working with Michael Vassar on the Board of Humanity+, of which I'm Chair and he's a Board member.

When SIAI was helping fund OpenCog, I took the title of "Director of Research" of SIAI, but I never actually directed any research there apart from OpenCog. The other SIAI research was always directed by others, which was fine with me. There were occasional discussions about operating in a more unified manner, but it didn't happen. All this is perfectly ordinary in a small start-up type organization.

Once SIAI decided OpenCog was no longer within its focus, after a bit of delay I decided it didn't make sense for me to hold the Director of Research title anymore, since as things were evolving, I wasn't directing any SIAI research. I remain an Advisor to SIAI, which is going great.

Now, on to the meat of the post….

SIAI's Scary Idea (Which I Don't Agree With)

SIAI's leaders and community members have a lot of beliefs and opinions, many of which I share and many not, but the key difference between our perspectives lies in what I'll call SIAI's "Scary Idea", which is the idea that: progressing toward advanced AGI without a design for "provably non-dangerous AGI" (or something closely analogous, often called "Friendly AI" in SIAI lingo) is highly likely to lead to an involuntary end for the human race.

(SIAI's Scary Idea has been worded in many different ways by many different people, and I tried in the above paragraph to word it in a way that captures the idea fairly if approximatively, and won't piss off too many people.)

Of course it's rarely clarified what "provably" really means. A mathematical proof can only be applied to the real world in the context of some assumptions, so maybe "provably non-dangerous AGI" means "an AGI whose safety is implied by mathematical arguments together with assumptions that are believed reasonable by some responsible party"? (where the responsible party is perhaps "the overwhelming majority of scientists" … or SIAI itself?) … I'll say a little more about this below.

Please note that, although I don't agree with the Scary Idea, I do agree that the development of advanced AGI has significant risks associated with it. There are also dramatic potential benefits associated with it, including the potential of protection against risks from other technologies (like nanotech, biotech, narrow AI, etc.). So the development of AGI has difficult cost-benefit balances associated with it -- just like the development of many other technologies.

I also agree with Nick Bostrom and a host of SF writers and many others that AGI is a potential "existential risk" -- i.e. that in the worst case, AGI could wipe out humanity entirely. I think nanotech and biotech and narrow AI could also do so, along with a bunch of other things.

I certainly don't want to see the human race wiped out! I personally would like to transcend the legacy human condition and become a transhuman superbeing … and I would like everyone else to have the chance to do so, if they want to. But even though I think this kind of transcendence will be possible, and will be desirable to many, I wouldn't like to see anyone forced to transcend in this way. I would like to see the good old fashioned human race continue, if there are humans who want to maintain their good old fashioned humanity, even if other options are available.

But SIAI's Scary Idea goes way beyond the mere statement that there are risks as well as benefits associated with advanced AGI, and that AGI is a potential existential risk.

Finally, I note that most of the other knowledgeable futurist scientists and philosophers who have come into close contact with SIAI's perspective also don't accept the Scary Idea. Examples include Robin Hanson, Nick Bostrom and Ray Kurzweil.

There's nothing wrong with having radical ideas that one's respected peers mostly don't accept. I totally get that: My own approach to AGI is somewhat radical, and most of my friends in the AGI research community, while they respect my work and see its potential, aren't quite as enthused about it as I am. Radical positive changes are often brought about by people who clearly understand certain radical ideas well before anyone else "sees the light." However, my own radical ideas are not telling whole research fields that if they succeed they're bound to kill everybody ... so it's a somewhat different situation.

What is the Argument for the Scary Idea?

Although an intense interest in rationalism is one of the hallmarks of the SIAI community, I have not yet seen a clear logical argument for the Scary Idea laid out anywhere. (If I'm wrong, please send me the link, and I'll revise this post accordingly. Be aware that I've already at least skimmed everything Eliezer Yudkowsky has written on related topics.)

So if one wants a clear argument for the Scary Idea, one basically has to construct it oneself.

As far as I can tell from discussions and the available online material, some main ingredients of people's reasons for believing the Scary Idea are ideas like:

If one pulled a random mind from the space of all possible minds, the odds of it being friendly to humans (as opposed to, e.g., utterly ignoring us, and being willing to repurpose our molecules for its own ends) are very low

Human value is fragile as well as complex, so if you create an AGI with a roughly-human-like value system, then this may not be good enough, and it is likely to rapidly diverge into something with little or no respect for human values

"Hard takeoffs" (in which AGIs recursively self-improve and massively increase their intelligence) are fairly likely once AGI reaches a certain level of intelligence; and humans will have little hope of stopping these events

A hard takeoff, unless it starts from an AGI designed in a "provably Friendly" way, is highly likely to lead to an AGI system that doesn't respect the rights of humans to exist

I emphasize that I am not quoting any particular thinker associated with SIAI here. I'm merely summarizing, in my own words, ideas that I've heard and read very often from various individuals associated with SIAI.

If you put the above points all together, you come up with a heuristic argument for the Scary Idea. Roughly, the argument goes something like: If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.
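The structure of this heuristic argument can be sketched as a chain of conditional probabilities. To be clear, the numbers below are purely illustrative placeholders I've made up for the sketch, not anyone's actual estimates:

```python
# Toy sketch of the Scary Idea's heuristic argument as a probability chain.
# All numbers here are illustrative placeholders, NOT estimates from SIAI
# or anyone else.

p_hard_takeoff = 0.8     # P(hard takeoff | advanced non-Friendly AGI is built)
p_unfriendly_mind = 0.9  # P(resulting mind disregards human values | hard takeoff)

p_doom = p_hard_takeoff * p_unfriendly_mind
print(f"P(involuntary end to humanity) ~ {p_doom:.2f}")

# The conclusion is only as strong as its weakest premise: discounting
# either factor deflates the product accordingly.
for discount in (0.5, 0.1):
    print(f"  with one premise discounted to {discount}: "
          f"{p_hard_takeoff * discount:.2f}")
```

The point of the sketch is structural: the Scary Idea's conclusion is a conjunction of premises, so doubting any one premise substantially (as I do with the last three points) deflates the whole argument.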

The line of argument makes sense, if you accept the premises.

But, I don't.

I think the first of the above points is reasonably plausible, though I'm not by any means convinced. I think the relation between breadth of intelligence and depth of empathy is a subtle issue which none of us fully understands (yet). It's possible that sufficient real-world intelligence tends to bring a sense of connectedness with the universe that militates against squashing other sentiences. But I'm not terribly certain of this, any more than I'm terribly certain of its opposite.

I agree much less with the final three points listed above. And I haven't seen any careful logical arguments for these points.

I doubt human value is particularly fragile. Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology. I think it's fairly robust.

I think a hard takeoff is possible, though I don't know how to estimate the odds of one occurring with any high confidence. I think it's very unlikely to occur until we have an AGI system that has very obviously demonstrated general intelligence at the level of a highly intelligent human. And I think the path to this "hard takeoff enabling" level of general intelligence is going to be somewhat gradual, not extremely sudden.

I don't have any strong sense of the probability of a hard takeoff, from an apparently but not provably human-friendly AGI, leading to an outcome likable to humans. I suspect this probability depends on many features of the AGI, which we will identify over the next years & decades via theorizing based on the results of experimentation with early-stage AGIs.

Yes, you may argue: the Scary Idea hasn't been rigorously shown to be true… but what if it IS true?

OK but ... pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

The Scary Idea is certainly something to keep in mind, but there are also many other risks to keep in mind, some much more definite and palpable. Personally, I'm a lot more worried about nasty humans taking early-stage AGIs and using them for massive destruction, than about speculative risks associated with little-understood events like hard takeoffs.

Is Provably Safe or "Friendly" AGI A Feasible Idea?

The Scary Idea posits that if someone creates advanced AGI that isn't somehow provably safe, it's almost sure to kill us all.

But not only am I unconvinced of this, I'm also quite unconvinced that "provably safe" AGI is even feasible.

The idea of provably safe AGI is typically presented as something that would exist within mathematical computation theory or some variant thereof. So that's one obvious limitation of the idea: mathematical computers don't exist in the real world, and real-world physical computers must be interpreted in terms of the laws of physics, and humans' best understanding of the "laws" of physics seems to radically change from time to time. So even if there were a design for provably safe real-world AGI, based on current physics, the relevance of the proof might go out the window when physics next gets revised.

Also, there are always possibilities like: the alien race that is watching us and waiting for us to achieve an IQ of 333, at which point it will swoop down upon us and eat us, or merge with us. We can't rule this out via any formal proof, and we can't meaningfully estimate the odds of it either. Yes, this sounds science-fictional and outlandish; but is it really more outlandish and speculative than the Scary Idea?

A possibility that strikes me as highly likely is that, once we have created advanced AGI and have linked our brains with it collectively, most of our old legacy human ideas (including physical law, aliens, and Friendly AI) will seem extremely limited and ridiculous.

Another issue is that the goal of "Friendliness to humans" or "safety" or whatever you want to call it, is rather nebulous and difficult to pin down. Science fiction has explored this theme extensively. So even if we could prove something about "smart AGI systems with a certain architecture that are guaranteed to achieve goal G," it might be infeasible to apply this to make AGI systems that are safe in the real-world -- simply because we don't know how to boil down the everyday intuitive notions of "safety" or "Friendliness" into a mathematically precise goal G like the proof refers to.

This is related to the point Eliezer Yudkowsky makes that "value is complex" -- actually, human value is not only complex, it's nebulous and fuzzy and ever-shifting, and humans largely grok it by implicit procedural, empathic and episodic knowledge rather than explicit declarative or linguistic knowledge. Transmitting human values to an AGI is likely to be best done via interacting with the AGI in real life, but this is not the sort of process that readily lends itself to guarantees or formalization.

Eliezer has suggested a speculative way of getting human values into AGI systems called Coherent Extrapolated Volition, but I think this is a very science-fictional and incredibly infeasible idea (though a great SF notion). I've discussed it and proposed some possibly more realistic alternatives in a previous blog post (e.g. a notion called Coherent Aggregated Volition). But my proposed alternatives aren't guaranteed-to-succeed nor neatly formalized.

But setting those worries aside, is the computation-theoretic version of provably safe AI even possible? Could one design an AGI system and prove in advance that, given certain reasonable assumptions about physics and its environment, it would never veer too far from its initial goal (e.g. a formalized version of the goal of treating humans safely, or whatever)?

I very much doubt one can do so, except via designing a fictitious AGI that can't really be implemented because it uses infeasibly much computational resources. My GOLEM design, sketched in this article, seems to me a possible path to a provably safe AGI -- but it's too computationally wasteful to be practically feasible.

I strongly suspect that to achieve high levels of general intelligence using realistically limited computational resources, one is going to need to build systems with a nontrivial degree of fundamental unpredictability to them. This is what neuroscience suggests, it's what my concrete AGI design work suggests, and it's what my theoretical work on GOLEM and related ideas suggests. And none of the public output of SIAI researchers or enthusiasts has given me any reason to believe otherwise, yet.

Practical Implications

The above discussion of SIAI's Scary Idea may just sound like fun science-fictional speculation -- but the reason I'm writing this blog post is that when I posted a recent blog post about my current AGI project, the comments field got swamped with SIAI-influenced people saying stuff in the vein of: Creating an AGI without a proof of Friendliness is essentially equivalent to killing all people! So I really hope your OpenCog work fails, so you don't kill everybody!!!

(One amusing/alarming quote from a commentator (probably not someone directly affiliated with SIAI) was "if you go ahead with an AGI when you're not 100% sure that it's safe, you're committing the Holocaust." But it wasn't just one extreme commentator, it was a bunch … and then a bunch of others commenting to me privately via email.)

If one fully accepts SIAI's Scary Idea, then one should not work on practical AGI projects, nor should one publish papers on the theory of how to build AGI systems. Instead, one should spend one's time trying to figure out an AGI design that is somehow provable-in-advance to be a Good Guy. For this reason, SIAI's research group is not currently trying to do any practical AGI work.

Actually, so far as I know, my "GOLEM" AGI design (mentioned above) is closer to a "provably Friendly AI" than anything the SIAI research team has come up with. At least, it's closer than anything they have made public.

However GOLEM is not something that could be practically implemented in the near future. It's horribly computationally inefficient, compared to a real-world AGI design like the OpenCog system I'm now working on (with many others -- actually I'm doing very little programming these days, so happily the project is moving forward with the help of others on the software design and coding side, while I contribute at the algorithm, math, design, theory, management and fundraising levels).

I agree that AGI ethics is a Very Important Problem. But I doubt the problem is most effectively addressed by theory alone. I think the way to come to a useful real-world understanding of AGI ethics is going to be to

study these early-stage AGI systems empirically, with a focus on their ethics as well as their cognition

in the usual manner of science, attempt to arrive at a solid theory of AGI intelligence and ethics based on a combination of conceptual and experimental-data considerations

let humanity collectively plot the next steps from there, based on the theory we find: maybe we go ahead and create a superhuman AI capable of hard takeoff, maybe we pause AGI development because of the risks, maybe we build an "AGI Nanny" to watch over the human race and prevent AGI or other technologies from going awry. Whatever choice we make then, it will be made based on far better knowledge than we have right now.

So what's wrong with this approach?

Nothing, really -- if you hold the views of most AI researchers or futurists. There are plenty of disagreements about the right path to AGI, but there is wide, implicit agreement that something like the above path is sensible.

But, if you adhere to SIAI's Scary Idea, there's a big problem with this approach -- because, according to the Scary Idea, there's too huge of a risk that these early-stage AGI systems are going to experience a hard takeoff and self-modify into something that will destroy us all.

But I just don't buy the Scary Idea.

I do see a real risk that, if we proceed in the manner I'm advocating, some nasty people will take the early-stage AGIs and either use them for bad ends, or proceed to hastily create a superhuman AGI that then does bad things of its own volition. These are real risks that must be thought about hard, and protected against as necessary. But they are different from the Scary Idea. And they are not so different from the risks implicit in a host of other advanced technologies.

Conclusion

So, there we go.

I think SIAI is performing a useful service by helping bring these sorts of ideas to the attention of the futurist community (alongside the other services they're performing, like the wonderful Singularity Summits). But, that said, I think the Scary Idea is potentially a harmful one. At least, it WOULD be a harmful one, if more people believed it; so I'm glad it's currently restricted to a rather small subset of the futurist community.

Many people die each day, and many others are miserable for various reasons -- and all sorts of other advanced and potentially dangerous technologies are currently under active development. My own view is that unaided human minds may well be unable to deal with the complexity and risk of the world that human technology is unleashing. I actually suspect that our best hope for survival and growth through the 21st century is to create advanced AGIs to help us on our way -- to cure disease, to develop nanotech and better AGI and invent new technologies; and to help us keep nasty people from doing destructive things with advanced technology.

I think that to avoid actively developing AGI, out of speculative concerns like the Scary Idea, would be an extremely bad idea.

That is, rather than "if you go ahead with an AGI when you're not 100% sure that it's safe, you're committing the Holocaust," I suppose my view is closer to "if you avoid creating beneficial AGI because of speculative concerns, then you're killing my grandma" !! (Because advanced AGI will surely be able to help us cure human diseases and vastly extend and improve human life.)

So perhaps I could adopt the slogan: "You don't have to kill my grandma to avoid the Holocaust!" … but really, folks… Well, you get the point….

Humanity is on a risky course altogether, but no matter what I decide to do with my life and career (and no matter what Bill Joy or Jaron Lanier or Bill McKibben, etc., write), the race is not going to voluntarily halt technological progress. It's just not happening.

We just need to accept the risk, embrace the thrill of the amazing time we were born into, and try our best to develop near-inevitable technologies like AGI in a responsible and ethical way.

And to me, responsible AGI development doesn't mean fixating on speculative possible dangers and halting development until ill-defined, likely-unsolvable theoretical/philosophical issues are worked out to everybody's (or some elite group's) satisfaction.

Rather, it means proceeding with the work carefully and openly, learning what we can as we move along -- and letting experiment and theory grow together ... as they have been doing quite successfully for the last few centuries, at a fantastically accelerating pace.

In my view, an AI singularity would be akin to the one taking place when humans became sentient. From humanity's simple desire to install itself permanently and exclusively at the top of the food chain, all sorts of interesting developments followed. The most important is that we are hostile and lethal to every macroscopic lifeform that is not (cute || (edible && farmable)). It may have taken 200,000 years for us to facilitate the takeoff (give or take an order of magnitude), but eventually we will have ruined the original biosphere.

Simply put, the proponents of friendly AI seem to think that we are extremely cute (since edibility is out).

From talking to SIAI folks I got the impression that they for the most part have doubts about our cuteness, which you might not share. They are also aware that the speed of the takeoff is not important, nor are the conditions of the creation of a subset of AGIs that would conform to "provable friendliness".

Provable friendliness, by the way, would have to entail that the AGI does not reach certain parameters (with respect to resources, capacities, goals, etc.). It could very well turn out that interesting AI cannot be created with guaranteed friendliness.

Thus, what it will probably come down to, is some kind of "Turing Police" (as suggested in Neuromancer). During coffee breaks at the ECAP conference in Munich, I heard a couple of interesting pitches on how such policing might be achieved (for instance, by uploading and transsuperhumanizing some extremely trustworthy human fellow first, so he can act like Larry Niven's Pak Protector for us).

Sure, it is probably not going to work, and a malevolent AI singularity is not certain, but if you put a value of minus infinity on the realization of a true existential risk, all sorts of things suddenly seem worthwhile. And, you know, the SIAI guys tend to be Bayesian...
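The decision-theoretic move being alluded to can be made concrete: if an outcome is assigned a utility of minus infinity, then any action carrying a nonzero probability of that outcome has expected utility of minus infinity, so arbitrarily costly preventive measures dominate it. A minimal sketch, where the specific utilities and probabilities are invented purely for illustration:

```python
# Expected utility with a minus-infinity penalty on existential catastrophe.
# All probabilities and utilities below are made-up illustrative values.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

# Proceeding with AGI: a tiny chance of doom, an enormous chance of benefit.
proceed = [(1e-9, float('-inf')), (1 - 1e-9, 1_000_000)]

# "Turing Police"-style prevention: certain, but merely finite, cost.
prevent = [(1.0, -1_000)]

print(expected_utility(proceed))  # -inf: any nonzero doom probability dominates
print(expected_utility(prevent))  # -1000.0
```

Under this utility assignment, the finite-cost policing option wins no matter how small the doom probability or how large the upside, which is exactly why the minus-infinity valuation makes "all sorts of things suddenly seem worthwhile."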

However, there are more realistic ways of preventing an AI Singularity. For instance, continuing to focus AI funding the way they do now.

When I first really encountered the idea of the Singularity 10 years ago, I was at first awed, then terrified. I guess I arrived at the "scary idea" by my own reaction. But in the decade since, I have grown more and more aware of my own cognitive limitations, and I am now almost fully convinced that AI/transhumanism is the best hope for everything and everyone, even those who choose NOT to take part in it. Reading your post helped convince me even more. Keep up the good work!

To make a subtle but important point: SIAI's goal is not to build an AGI that everyone (or even just SIAI!) believes is safe, under a proof whose assumptions are felt to be reasonable by some group of people. The goal of SIAI (and every other AGI researcher for that matter) is to make an AGI that is safe. That is the criterion of success. If everyone is convinced that an AGI design is safe, and it isn't, all the AGI designers have failed.

A central point of SIAI that makes it unique is that there are a number of cognitive biases that strike when people think about AI. One of the big ones is anthropomorphization. Humans are evolved to deal with other humans, and we tend to assume other things are like humans, until we gain deeper understanding. Just like when the Greeks thought about thunder and the seasons, and when many early biologists thought about evolution, when people think about AI we tend to assume the phenomenon has a mind like us. Of course being a mind, AGI is the most subtle instance of anthropomorphization. The danger of this lies in assuming that any arbitrary optimization process is going to automatically have the values of social, plains-dwelling apes, values which we have for very distinct, evolutionary purposes.

Another cognitive bias is that it is easy for us to imagine our own grandmothers dying, death being a terrible thing that nobody should have to go through. We have the experience of seeing people grow old and die. (I quite agree with your point in the conclusion that thousands are dying all the time, and that this is terrible, but of course billions is even worse.) It's also not too hard to imagine the headlines as someone else develops AGI, becoming the renowned creator of perhaps the most pivotal invention of our entire history. It is a much weirder thing to imagine pushing a button and being the cause of death of 6 billion people, because it turned out a design wasn't safe, and it seemed worth giving it a shot. Killing a grandmother leaps much more readily to the imagination than killing everyone's grandmothers.

As for point one (that a random mind will not likely be friendly to humans), there are several elements to that claim, but much of it is described in Steven Omohundro's paper, "Basic AI Drives" (http://selfawaresystems.com/2007/11/30/paper-on-the-basic-ai-drives/).

As for point two, Eliezer's "Fun Theory Sequence" is probably the best reading on the subject (http://lesswrong.com/lw/xy/the_fun_theory_sequence/).

Point three isn't actually critical. The only important thing is that the first AI to reach a certain level be Friendly. The level I'm referring to is the level at which an AI has enough power to halt all other AI projects, in order to stop them from hampering its goals. However, the harder the takeoff, the harder it is to see unFriendliness coming. Note that any unFriendly AI will seek to act like it's Friendly, once it is smart enough to model people fairly well, but before it can just stop people from doing anything about it.

As for point 4, see the Omohundro paper. Unless the AI is designed to care at least somewhat about human existence (defining "human" and "life" to the AI is critical here), it won't care in the slightest (this is a tautology, i.e., unavoidably true). We would then be like horses to the AI, except that humans care about horses at least a little. And the AI has nanotech. A look at our concern for other species is a good indication of the actions of a much more powerful, uncaring intelligence. Note also that much of human environmentalism is based on the understanding that WE suffer if there is widespread extinction. And again, an AI with nanotech doesn't face this problem. And then even if it did care about human life, does it care that we are "happy"? Or free to move about, or to make our own choices, or that we don't just have wires in our brains and sit there in a medicated, blissful stupor for thousands of years? This is a lot to ask of any arbitrary mind. The fact that we don't know how to specify these things to an AI just means that we can't yet tell an AI to do these things.

Point five, I agree that these are mostly accurate and fair summaries of some key ideas of SIAI.

In your critique of value as fragile, you seem to be using the word to mean "static". When SIAI talks about value as fragile, they mean that it is complicated, has many parts (that people assume but seldom realize), and that the lack of even one of those parts would result in the loss of much of what we care about. Like a perfect world without boredom, the AI causing us to relive one really great experience over and over, millions of times. Not completely without value, much better than everyone dying, but kind of a shame of a future, and that's in the case of getting things MOSTLY right.

You imply that a reader may say that even though some claims (like these ideas of SIAI) aren't proven, they COULD still happen. Any such reader is using some very silly reasoning. The members of SIAI are Bayesians. They care very, very much about probabilities.

As for Friendly AI being feasible:

Just because you can't implement mathematical computers in the real world has nothing to do with mathematical theorems being true. Even if physics did change, a mathematical theorem of Friendliness would remain the same.

Regarding applications, are you saying that we shouldn't trust any application, or any of our theories about the world, because physics could change at any moment? For one thing, do you have such a concern for other domains, like whether it's worth trying to go to the grocery store to get food? At any moment our entire understanding might change. No use for theories then!

Secondly, we have no reason to expect (i.e., put significant probability on) truly radical changes in physics. New discoveries in physics add subtlety and detail and change the underlying model, but Newton's laws are still just as good approximations now as they were before relativity or quantum mechanics.

As for possibilities like aliens waiting for us to achieve an IQ of 333, such arbitrary goal systems would be incredibly rare in the universe. What kind of well-formed utility function is optimized by a move like that? Prediction should not be done by randomly judging arbitrary hypotheses, or by how "outlandish something sounds," but by detailed thinking and logical argument. By the way, calling Friendly AI Theory "the Scary Idea" is somewhat patronizing, and sounds more like political than technical debate.

Unless the AI is designed to care at least somewhat about human existence (defining "human" and "life" to the AI is critical here), it won't care in the slightest (this is a tautology, i.e., unavoidably true).

This is certainly NOT a tautology. Omohundro's paper gives heuristic arguments not proofs.

It is a logical possibility that compassion toward humans would tend to emerge in intelligent systems as a consequence of their interaction with the world. Omohundro's arguments are quite generic and don't take into account many of the specific properties of the world the AI would grow up in.

However, none of us building AGIs plan to create systems with no care for human life. We plan to teach our systems to treat humans well, via a combination of methods. We don't plan to build "random minds."

So it's an irrelevant point anyway.

The danger of this lies in assuming that any arbitrary optimization process is going to automatically have the values of social, plains-dwelling apes, values which we have for very distinct, evolutionary purposes.

I don't believe that serious AGI researchers are commonly making this error. We are architecting systems with care and plan to teach them with care; we're not counting on such things happening "automatically."

By the way, calling Friendly AI Theory "the Scary Idea" is somewhat patronizing, and sounds more like political than technical debate.

Hmmm, I think it sounds precisely as serious as the term "Friendly AI Theory"!!!! Eli's term sounds playful to me, and so my term is in a similar playful spirit. The use of capitalization for emphasis, as in the term "Scary Idea", is also a common rhetorical device of Eliezer's ;-) ... so I thought it apropos to the topic. If you want to discuss literary style ;-p ...

It is a much weirder thing to imagine pushing a button and being the cause of death of 6 billion people, because a design turned out not to be safe after it seemed worth giving it a shot.

I don't have any trouble imagining that. Sorry if you do. I don't want it to happen, but it's easy to imagine.

when SIAI talks about value as fragile, they mean that it is complicated, has many parts (that people assume but seldom realize), and that the lack of even one of those parts would result in the loss of much of what we care about.

I agree that human value is complicated with many parts.

However, I suspect human value, like many other natural systems, has an "autopoietic" property. Remove one part, and something else similar will grow back.

Note that many of the values WE consider essential human values are not part of essential human values according to the Chinese or the Yanomamo.

And if someone from 500BC came into modern California, they would surely conclude that some fundamental human values had been lost.

The members of SIAI are Bayesians. They care very, very much about probabilities.

I know -- but I haven't seen them produce any serious arguments that the probability of a real-world AGI project (based on teaching a well-architected human-level AGI to be ethical) leading to human extinction is high.

Yet, I have frequently heard many SIAI-associated folks speak AS IF such an argument exists.

Darth wrote:

Sure, it is probably not going to work, and a malevolent AI singularity is not certain, but if you put a value of minus infinity on the realization of a true existential risk, all sorts of things suddenly seem worthwhile.

If you put a value of minus infinity on human extinction, then you should spend all your efforts trying to eliminate all advanced technology and return us to the Stone Age somehow. Perhaps by launching a massive war that destroys civilization yet doesn't kill everybody.

So to me, placing a value of minus infinity on human extinction seems extremely dangerous.
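To make the point concrete, here is a toy expected-utility sketch (the probabilities and utilities are made up purely for illustration): once extinction is valued at minus infinity, any option carrying even a sliver of extinction risk has expected utility minus infinity, so the framework can no longer rank options at all.

```python
import math

def expected_utility(outcomes):
    """Expected utility over (probability, utility) outcome pairs."""
    return sum(p * u for p, u in outcomes)

# Hypothetical numbers, purely for illustration.
# Building AGI: 99% great outcome, 1% extinction.
build_agi = [(0.99, 100.0), (0.01, -math.inf)]
# Returning to the Stone Age: miserable, and extinction risk is
# only reduced, never zero (asteroids, supervolcanoes, ...).
stone_age = [(1.0 - 1e-9, -50.0), (1e-9, -math.inf)]

# Both come out at -inf: minus-infinity utilities make every
# option with nonzero extinction risk incomparable.
print(expected_utility(build_agi))  # -inf
print(expected_utility(stone_age))  # -inf
```

Under such a valuation the only "rational" policies are the ones that drive extinction probability to exactly zero, which nothing achievable does; hence finite (if large) negative utilities are the workable choice.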

Prediction should not be done by randomly judging arbitrary hypotheses, or by how "outlandish something sounds," but by detailed thinking and logical argument.

Sure... but can you point me to a paper somewhere giving a detailed logical argument as to why the Scary Idea is correct with a reasonably high probability?

SIAI has produced a lot of text and a lot of arguments about a lot of things. Since this one point is so critical to their unique mission, it would be nice if there were a crisp, rigorous, well-structured rational probabilistic argument for this point somewhere.. wouldn't it?

In that case, maybe I would have written a paper questioning the probabilities assigned to some of the premises in the argument in that paper... instead of reconstructing the argument for this key SIAI point heuristically on my own, and then explaining conceptually why I don't accept it.

It seems to me that the main burden is on the adherents of the Scary Idea to present their argument rigorously, rather than on me or others to provide rigorous counter-arguments to their informally-stated idea !!!

You imply that a reader may say that even though some claims (like these ideas of SIAI) aren't proven, they COULD still happen. Any such reader is using some very silly reasoning. The members of SIAI are Bayesians. They care very, very much about probabilities.

Eliezer and other members of the core SIAI group tend to be fairly careful in their statements and thinking, I agree.

However, not everyone who is influenced by the SIAI perspective has the same favorable properties.

What provoked this blog post of mine was the fact that my earlier blog post about my AGI work was assaulted by a mass of obviously SIAI-inspired comments about how my work, if successful, would almost surely kill us all.

I don't hold the core SIAI team morally responsible for comments from others like "if you create AGI without 100% certainty of its friendliness then you're creating the holocaust"; but nonetheless, their publicizing of the Scary Idea has almost surely played a causal role underlying the presence of such comments.

So, yeah, there evidently are folks out there who are advocating SIAI-sympathetic positions using silly reasoning. And these folks are pestering me and maligning my work based on their silly reasoning!!

In the current commercial environment, what pays is to promote oneself as the protector against a fuzzy enemy. Many fall prey to this, and think that screaming nonsense proves them smart.

These same trolls have popped up on several blogs recently, especially those that are trying to be of help.

The hard takeoff is rubbish, because true AGI ability is not a magic algorithm... so the dangerous AGI is one that is programmed that way by a human... and wiping out humans is a one-track program.

Besides, where in 'higher' intelligence does it say you need to exterminate anything? I guess if this AGI had a love for termites, it might conclude it needed to exterminate humans to stop humans from exterminating termites... sounds silly(?), yes.

IMHO, those that want to promote the scare simply have poor designs of their own and want to prevent the greater good from happening because they won't profit.

In the meantime there will be profit obtained by those that lower the bar regarding AI, as happened within the smartphone industry... the only way to win is to move forward and leave the limited mindset behind (still alive).

The idea that an "unfriendly" agent would wipe out humanity - and so we need a "friendly" one - becomes tautological *if* you define "unfriendly" to mean: wiping out humanity. That's *exactly* how Yudkowsky defines it:

"Friendly AI" : "the challenge of creating an AI that, e.g., cures cancer, rather than wiping out humanity" - E.Y.

"If you put a value of minus infinity on human extinction, then you should spend all your efforts trying to eliminate all advanced technology and return us to the Stone Age somehow. Perhaps by launching a massive war that destroys civilization yet doesn't kill everybody.

So to me, placing a value of minus infinity on human extinction seems extremely dangerous."

What you state here is silly, and I'm disappointed you haven't thought about such big issues carefully enough to avoid the errors you're making here.

For a sensible analysis of the question, see this Bostrom paper (it's not on a value of minus infinity, but close enough):

Their funding apparently depends on donations from those who believe the world is at risk - and the SIAI can help to save it. Such an organisation, if its goal involves attracting money, attention, and people, can be reasonably expected to exaggerate the risk of the end of the world - to help get more bums on seats, so to speak. I think the SIAI's proclamations are broadly consistent with this goal.

The organisation profits from the fear of others. It is a DOOM-mongering organisation. The END OF THE WORLD is a superstimulus to people's instincts to protect themselves and their loved ones - and such superstimuli have a long history of being used for marketing purposes.

The plan is not obviously incorrect - and I expect other would-be MESSIAHs - who are likely to become convinced of the reality of the machine apocalypse by DOOM-mongers - will try to SAVE THE WORLD by using similar strategies in the future.

Since I'm not a part of the Singularity Institute, guess I missed something really big.

No matter what is said, it is all in the marketing for funds.

This requires a straw man, and if you find yourself cast as somebody else's straw man, it would be wise in the long run to continue communicating intelligently to the larger population. Their strikes at the straw man only work if the straw man focuses on the strike.

Eventually, the money bags will move away from those with nothing but falsehoods, since they are but Chicken Littles... unfortunately, they will probably find the 'new' Chicken Little.

These same trolls have popped up on several blogs recently, especially those that are trying to be of help.

I see... well that is what inspired me to write the blog post.

Generally my attitude toward those I disagree with is "live and let live." And also, I would prefer to see the futurist movement be a bit more cooperative and unified, rather than having us bickering amongst ourselves.

However, if people are going to troll my blog with SIAI-inspired dissing, my human nature moves me to respond. I didn't feel like responding by trolling their blogs in kind, so I thought writing this blog was a more constructive approach....

What you state here is silly, and I'm disappointed you haven't thought about such big issues carefully enough to avoid the errors you're making here.

OK OK, that was a flip and jokey response of mine (that the best way to minimize existential risk is probably to go back to the Stone Age). But I find the idea that working on "provably Friendly AI" is the obviously correct route to minimizing existential risk, even more silly.

I'm familiar with Bostrom's nice papers on existential risk.

To discuss the point a little more seriously (without going into too much depth, as this is just a comment on a blog post), here are some possible routes to minimizing existential risk:

1) Back to the Stone Age, and hope that advanced technological development doesn't happen again. Maybe it was just a fluke! Or hope that if it happens again, civilization comes out nicer.

2) Create an AI Nanny or something similar, to watch over us. (But there's a risk that it goes wrong and becomes an Evil Nanny, etc.)

3) Perfect the art of selective relinquishment, which seems to be part of what Bostrom is suggesting in the paper you link to. I.e., develop those technologies that will let us survive and spread, and hold back on the ones with more obvious risk.

4) Spread ourselves wide through the cosmos, as Bostrom suggests, and then hope that none of the colonies creates an evil or indifferent super-AGI that will wipe out all the colonies.

...

There are various combinations of these possibilities, and maybe other possibilities as well.

Given our present state of knowledge, the odds of all these possibilities are extremely difficult to estimate in any sensible way.

I suppose that the "build a provably Friendly AI" approach falls in line with the "AI Nanny" idea. However, given the extreme difficulty and likely impossibility of making "provably Friendly AI", it's hard for me to see working on this as a rational way of mitigating existential risk.

Rather, if my main goal were to mitigate existential risk, I would probably work on either

-- building an AI Nanny using more plausible technology (and indeed this is one possible eventual use of the OpenCog technology I'm working on, if it succeeds in its scientific goals)

or

-- figuring out how to rapidly and broadly disperse humans or uploads (if I believed that uploads constitute humans) throughout the Cosmos

Personally, while I certainly strongly prefer the human race to survive, I don't assign infinite negative utility to human extinction.

I think that nearly all humans, in the coming centuries, are going to choose to become "more than human" in various ways, and in this process will give up many aspects of legacy "human values." One hope and goal I have is that this process should occur in a continuous, gradual, and ethical way, so that we can enjoy and appreciate and feel whole as we journey from humanity to the next phase.

Excellent post, Ben. I'm going to point to it from/cross-post this on my blog (http://becominggaia.wordpress.com/) and make a bunch of comments both here and there over the next few days (short summaries here with extended arguments there) with a few comments at LessWrong for good measure.

I'll start with a response to FrankAdamek's thoughtful comments. (By the way, like Ben, I have read the vast majority of Yudkowsky's writing -- feel free to point to them to illustrate your arguments for a given point but don't be surprised if I come back with counter-arguments to a number of his assumptions and, please, don't wave generally in the direction of a whole sequence and expect me to pick out or agree with your particular point).

Frank, if you put responsible between "other" and "AGI", I'll agree with the assumption that "The goal of SIAI (and every other AGI researcher for that matter) is to make an AGI that is safe." The problem is both with the definition of safe and with the fact that SIAI also claims that an "unsafe" AI will almost inevitably lead to unFriendly AI.

SIAI defines safe as "won't wipe out humanity". I think that it is virtually provable that this goal is inherently self-contradictory. I am, apparently, an Unfriendly Human, because I can certainly conceive of circumstances where my value system would accept the elimination of humanity (think "a choice between humanity and the annihilation of several more numerous and more advanced civilizations, forced by humanity's aggression or other shortcomings"). I believe that a mind that will not accept the elimination of humanity under any circumstances is fatally flawed.

SIAI is not unique in its concern about anthropomorphism and human bias. It is unique in the stridency of its belief that über rationality is the ONLY solution. There is a lot of scientific research (referenced in a number of my papers) that shows that even "rational" human thinking is compromised by bad assumptions, immediate dismissal of arguments before comprehension (the normal LessWrong denizen will generally use the pejorative "confused" when doing this), overlooking contradicting facts or refusing to accept rational counter-arguments, etc. SIAI devotes far too much time to setting up and knocking down so-called "strawman" arguments (who is going to generate "a random mind"?).

Steve Omohundro's paper was awesome. Unfortunately, it also goes totally off the rails midway through, when he starts making incorrect assumptions after overlooking some contradicting facts. Humans are goal-driven entities, yet the majority of us do NOT behave like psychopaths even when we are positive we can get away with it (i.e. when the social structure doesn't force us to behave). Missing this point means missing the path to the solution (I've written a lot more about this here, here, and here).

As a response to Ben's previous comment, I'll note that SIAI advocates and increasingly spends resources on more than just "provably Friendly AI". See this position paper for a list of their recommendations:

http://singinst.org/riskintro/index.html

Personally, I believe the pure FAI approach Yudkowsky is personally still mostly working on, will very likely fail (to produce sufficiently useful results), and that various Plan B strategies will therefore be what makes or breaks our future.

I'm however perfectly happy with SIAI still putting resources into the Yudkowskyan Plan A, especially since if they didn't, no-one else would. And there is a sufficient chance that good stuff comes out of that research. (And if Plan A surprisingly ends up working, it would be safer than the alternatives.)

I also hope the resources available to SIAI grow significantly, so they can put more resources into analyzing (and implementing) Plan B strategies. They seem less biased and more competent to do such research than most alternatives.

What's unfortunate is that some people troll your blog with holocaust references. It's not productive. Perhaps SIAI should publish a position paper for their fans to read on "how *not* to attack people who disagree with us, even though we think their approach to safety is sub-optimal".

Also, I particularly vehemently disagree with Eliezer's characterization of the human value system as "fragile", and with the belief that the complexity of the human value system is any sort of show-stopper. As I argued in my AGI '10 presentation, the best analogy to the human value system is the Mandelbrot set as viewed by a semi-accurate, biased individual who is also subject to color illusions (as we all are). The formula for the Mandelbrot set is rat-simple -- yet look at the beautiful complexity it leads to.

I (and the social psychologists) believe that the foundational formula of ethics (Friendliness) is quite simple -- Do that which is necessary to optimize the possibility of co-existence (it's the categorical imperative that Kant was looking for and arguably a solution to Eliezer's CEV question). It is very close to what the SIAI is proposing but there are absolutely critical distinctions -- the most major/obvious being "If humanity is a danger to the co-existence of everyone else and wiping them out is the only way to reduce that threat below a near unitary value, the above imperative will dictate its destruction."

I also firmly believe that the above imperative is an attractor that Omohundro should have recognized, that causes most of the "random minds" in Ben's point 1 to converge to something like human ethics, that makes human values robust and regenerative (as opposed to fragile) and makes it extremely unlikely that an AGI with it will diverge into unFriendly territory (or permit others to do so -- see altruistic punishment -- which, by the way, though, could be very bad for humans if we absolutely insist on our evil ways).

Good point on my "tautology", I was indeed sloppy with that one. What I meant to say was that unless the AI cares at least somewhat about human existence, it won't care at all. The assumption that makes a non-tautological claim is that an AI has to be designed to care about us in order to care about us, but here I was focusing on the tautology, to remind everyone of that point.

It is logically possible for compassion to emerge through interaction with humans, I agree. Whether it's probable is quite a separate concern. Note that human beings developed in interaction with many other animals, and look at how we treat them.

With regards to teaching an AI to care: what you can teach a mind depends on the mind. The best examples come from human beings: for hundreds of years many (though not all) parents have taught their children that it is wrong to have sex before marriage, a precept that many people break even when they think they shouldn't and feel bad about it. And that's with our built-in desires for social acceptance and hardware for propositional morality. For another example, you can't train tigers to care about their handlers. No matter how much time you spend with them and care for them, they sometimes bite off arms just because they are hungry. I understand most big cats are like this.

It's quite true that nobody plans to build a system with no concern for human life, but it's also true that many people assume Friendliness is easy.

You make a fine point that "Friendly AI Theory" is somewhat playful-sounding, and that "Scary Idea" is rather in the same vein; my prior comment on this was going somewhat too far. However, referring to it as such does carry connotations of reactionism and alarmism.

I'm glad to hear that you don't have difficulty imagining very bad things happening. As a useful skill in making rational estimates, I wish it were more widespread.

You make a fine point about the probable moral reactions of someone from 2500 years ago coming to modern-day California. Likewise, it is a fairly widespread belief within SIAI that we ourselves may find the future morally repulsive, at least at first glance. Ensuring that we don't just hold moral customs static, that they are free to advance, is yet another thing that makes Friendliness complicated.

But on that point, there still are constants, like the boredom example. Essentially all humans throughout time have cared about novelty (in their context), and do not simply repeat the same experience over and over.

If you define "well-architected" to mean "will do what I expect it to do, ie not kill everyone", then SIAI would agree that a well-architected AI will not kill everyone. Their arguments revolve around this being surprisingly difficult. If it were obviously difficult, the danger would be much smaller.

May I ask if you consider yourself to be a rationalist of the LessWrong variety? To make the distinction for readers who may be unfamiliar: Traditional Rationality says things such as that people must put forward testable predictions, and that there must be at least some amount of evidence sufficient to shift their beliefs. Traditional Rationality ensures that a group of people, with enough time, will come to good answers and advance human understanding, but it says nothing about being fast. The LessWrong style of rationality is focused on being able to perform this process not just slowly, and with other people, but also quickly and on one's own, as much as possible.

Human beings are NOT built to do this, being instead built to come up with good-sounding arguments that prevent us from publicly admitting we were wrong. In the evolutionary environment and throughout human history, it was more adaptive to have high social status than to actually be right. Traditional Rationality (wisely) takes advantage of this mechanism and adapts it to advancing knowledge, but we can arrive at the truth even more powerfully if we turn inward and fight the problem at its source: the many biases and inconsistent foibles of our reasoning. This is especially "helpful" in situations in which empirical data on an idea includes what does or does not wipe out the human species; we don't have the option of waiting for ultimate empirical data on this one. This is very sad and a real shame, but true.

As for putting negative infinity on human extinction: I wouldn't go that far, but if one were to do so, going back to the Stone Age would not be the best solution. It is guaranteed that our sun will bake the Earth in a few billion years, and while that's a lot of time from our current perspective, it's many orders of magnitude less than could be achieved if we leave Earth.

An unfortunate aspect of SIAI's major thesis is that it takes a great many inferential steps to make. This says nothing about the intelligence of those who make such steps, and this comment should not be interpreted as me saying that "only the smartest people can ever figure out what we mean." That would be silly, and a good indicator of something seriously wrong. What I DO mean to say is that it takes time to grasp all of the steps of the argument, and a succinct exposition of the entire thesis is quite difficult. As I understand it, they are working on a book to help make such a centralized exposition, but with their current manpower and funding, it is not expected for a few years at least. (This is separate from Eliezer's upcoming book on rationality.)

I agree with Aleksei that it would be good to have something about how not to attack other beliefs. The vast majority of possible arguments, even for a true claim, will be flawed.

To Dave Baldwin and Tim Tyler, on the topic of profits:

If one visits SIAI, it becomes very silly to propose that anyone intelligent and remotely rational would do what they are doing as a method of gaining riches. They are a non-profit, and look like it. They have 4 paid researchers (paid like grad students) and only 5 other paid members (for administration and other similar tasks), some of whom are only paid half-time. They rely largely on volunteers. These are very sharp people, some of whom know quite a bit about how to make money, and if they aren't researching these issues, they are often working full time in other jobs, and donate large amounts to SIAI. Even the people working with SIAI tend to put money in, not take it out, because there are so few that are interested in funding, compared to flashy AGI startups that declare assuredly they have solutions.

An excellent business course that I took once claimed that all human beings had exactly four goals: being safe, feeling good, looking good, and being right. I don't believe that anyone can cogently argue that this is incorrect (only that there are more helpful or effective answers).

Humans have evolved so that their pursuit of those four goals ends up causing them to (mostly unknowingly) support all the necessary conditions for co-existence -- thereby allowing them to survive and reproduce (look up the term "obligatorily gregarious").

Co-existence with the will of other willed entities that are willing to co-exist on an equal basis (i.e. parasites and slave-masters need not apply, they are obvious strawmen). This is an attractor and an evolutionarily stable dynamic. Start with it and it logically shouldn't veer into dangerous territory unless it is sufficiently unintelligent or incompetent -- in which case it shouldn't be a danger.

Humans certainly don't have a consistent conscious top-level goal (or goal set). We go around doing many things at the behest of our instincts that fulfill short term desires at the cost of long-term goals.

On the other hand, too much rationality also causes us to over-analyze and over-extend our biased, short-sighted opinions and assumptions to statements like: desiring to optimize our chances of existence necessarily means "We should live in as little space as we can manage with as few resources as we can manage, and put (even more) massive amounts into the military and infrastructure. Driving (driving!) to the movies would be ridiculously out of the question."

Further, Yudkowsky's demand that the AGI necessarily be a slave to humanity (and its subsequent demotion to merely an RPOP) is in direct conflict with the goal of co-existence, and is actually extremely likely, in a community subject to evolution, to lead to its being altruistically punished.

Not having resources doesn't mean you are not trying to obtain them. This organisation wants people to give them all their money to help them build a machine to take over the world. I don't know what motivates the individuals involved - and don't care to speculate - but it is very obvious that collectively they want money, attention and assistance - not least because they say as much on their web site.

The problems you point out with humans' attempts to be rational are exactly the things with which LessWrong/SIAI is concerned. Keep in mind that all those things you just discussed are not just things that are happening in a lab somewhere; those tendencies are at work, right now, in my mind and in yours. Are you adjusting for them? Keep also in mind that even when told about such biases, people tend to fall prey to them just the same. It takes deliberate effort and practice, not just awareness, in order to reduce them. I'm unclear as to the connection between the difficulty of being rational and your claim that SIAI sets up straw man arguments.

Yes, humans are goal-driven entities but we don't behave like psychopaths even when we can get away with it. This says something about our real goals.

To Tim: SIAI does seek money, attention, and assistance, as do most non-profit organizations, though to say that they want people to give all their money would be an exaggeration.

May I ask if you consider yourself to be a rationalist of the LessWrong variety? ...The LessWrong style of rationality is focused on being able to perform this process not just slowly, and with other people, but also quickly and on one's own, as much as possible.

I hesitate to identify myself as being a dude of the "Less Wrong variety" since I disagree with a fair bit of the stuff on that blog.

I've read a bunch of the heuristics & biases literature and made some serious attempts to tweak my cognitive processing to avoid known common cognitive errors.

I think that developing better and better "rationality fu" (in the manner that some of the SIAI folks ... as well as others I know like Patri Friedman who doesn't seem to buy the whole SIAI belief system ... have suggested) is an interesting and worthwhile idea.

Having said that, I am not particularly impressed with the judgments made by the SIAI community, and I take that as a piece of evidence AGAINST the usefulness of putting energy into developing "rationality fu."

I wrote a post on a closely related topic (a reaction to the notion of "overcoming bias") some time ago,

If one visits SIAI, it becomes very silly to propose that anyone intelligent and remotely rational would do what they are doing as method of gaining riches.

You're missing the point these guys were making.

They weren't claiming that SIAI is a get-rich scheme. I agree that SIAI isn't likely to get anybody rich.

Their point was: SIAI gets its $$ (which is needed to pay e.g. for staff salaries) by private donations, and part of the reason some people donate to SIAI is probably because they feel scared about the future, and think SIAI's work may help to mitigate the risk of bad things happening.

So, to a certain extent, SIAI's fundraising pitch includes an element of fear.

But I wouldn't want to overstate this. The Scary Idea is about fear, but the Singularity Summits are not focused on that at all -- they're focused on the exciting, amazing possibilities that technology is offering and will offer.

So I feel Dave and Tim significantly overstated the case. SIAI's fundraising is not wholly based on fear, it's a lot more multidimensional than that. This is something I know about, because back in 2007-2008 I put in some time helping Tyler Emerson with SIAI fundraising.

An unfortunate aspect of SIAI's major thesis is that it takes a great many inferential steps to make.

I can empathize with that very well, as my argument for why my approach to AGI can succeed also takes a great many inferential steps to make.

I've written some short papers on my AGI approach, but I know they're not compelling on their own, just high-level evocations.

And then, I've recently written a 900 page book on the topic (first draft done, to appear in 2011) ... but I know few will take the time to read all that!!

And of course, if I'm right then a few years after my AGI succeeds, somebody is bound to boil down the basic arguments into an elegant, obvious-looking 10 pages....

New ideas don't always come along with the right vocabulary and framework for expressing them compactly. This often comes only in hindsight.

However, I'm merely in the position of saying "I have a complex argument why I can create a beneficial AGI via this approach, but it will take you a lot of study to understand the argument, so if you don't have time to do the study, perhaps you should trust me and my colleagues and fund our work anyway."

SIAI is in the position of saying "I have a complex argument why you other researchers should stop what you're doing, but it's not really written down anywhere, except in these blog posts that admittedly have a lot of gaps in them; so we think you should take our word for it that we can fill in the gaps, and halt your work."

What's unfortunate is that some people troll your blog with holocaust references. It's not productive.

As a Jew (by heredity and culture, but of course I'm not religious) I find Holocaust references in especially bad taste ;-p .... Many of my relatives died in the WWII Holocaust. I certainly don't want the same to happen to the rest of us ;p ..

But online trolling is not the only annoying consequence of the Scary Idea.

Actually, I've had two separate (explicitly) SIAI-inspired people tell me in the past that "If you seem to be getting too far with your AGI work, someone may have to kill you to avert existential risk." Details were then explained, regarding how this could be arranged.

These threats were NOT official SIAI pronouncements, obviously, and I don't blame SIAI for them. But this sorta shit does indicate to me the powerful effect the Scary Idea can have on people.

It does seem plausible to me that, if Scary Idea type rhetoric were amplified further and became more public, it could actually lead to violence against AGI researchers -- similar to what we've seen in abortion clinics, or against researchers doing experimentation on animals, etc.

So personally I find it important to make very clear to the public that there IS no carefully backed up argument for the Scary Idea anywhere.

Sure, maybe there's a complex argument for the Scary Idea, full of gaps and hand-waving, spread across various Less Wrong blog posts. A sketchy argument like that is no reason to tell researchers to stop their work, to tell investors not to fund OpenCog or other AGI projects, nor to threaten to rub out AGI researchers ;p ...

I certainly don't have any gripe with the world paying Eliezer's bills while he thinks about provably Friendly AI!!

As you said, it's always possible he'll come up with a breakthrough ... and even if he doesn't achieve his ultimate goal, maybe he'll discover a lot of other cool stuff.

Of all the suboptimally useful expenditures I see in the world around me, Eliezer's salary is nowhere near the top of the list!

I would rather see $$ go to concrete AGI R&D than to speculative thinking about infeasible-sounding routes to AGIs possessing certain idealized properties. But I would rather see $$ go to speculative AGI/futurist thinking than to a hell of a lot of stuff that our society spends its resources on ;p ...

[Note: some of my posts seem to disappear from the page when I post another, yet show up in email notifications. I am not sure why or when this happens, and my previous theory was incorrect. In any case this comment is my last for now, and I sincerely apologize for any undue repetition. I attempted to post it somewhat earlier.]

I have no idea why that would happen... this blog is just set up using the default blogger.com setup, and I've never seen that sort of problem before. Hmmm.

This post is from Frank Adamek not Ben Goertzel; for some reason it wouldn't post for him.

Dave, you say that "many fall prey, and think it proves them smart by screaming nonsense." How many is "many"? Most AGI researchers? How many AGI researchers besides SIAI are seriously concerned about this? Do most of the people at AGI '10 talk about how difficult it is to specify human values to a computer, and that we're taking huge risks if we don't know how to do this?

Please keep in mind that SIAI was formed with the goal of creating AI in a similar fashion to everyone else. It initially had nothing to do with Friendliness.

I think you are confusing the AGI Friendliness issue with that of something the general populace cares about. Among the general populace, fear and hesitancy to change are quite common, and it is likewise common for someone to pander to public support by rousing such fears against vague and uncertain dangers. In contrast, it is primarily transhumanists who even take AGI seriously. This is a group that grew up reading about the wonders of technology and all the things it can do (as did supporters and members of SIAI, who still believe it). "Doom and gloom" is NOT popular in that crowd, as you can see. Doom and gloom is not generally popular with SIAI either; see this post for some of Eliezer's thoughts on the subject: http://lesswrong.com/lw/u0/raised_in_technophilia/

Regarding your termite example, this is a very good one. I completely fail to see how humans help termite well being. If all I cared about was termites, I would get rid of all threats to termites as my first step, and then do something like fill the world with termite food, or maybe just put them all in stasis if all I cared about is that they exist. That would be ideal for my goals. I certainly could come up with better ways to protect termites than keeping humans around! I mean, those human things sometimes KILL termites! I would obviously sacrifice any number of human lives (I mean, they aren't termites, who are we kidding) to make even one termite that much safer. It's not that I DISlike humans, but come on, they aren't termites.

Tim: thanks for making clear what SIAI means by Friendliness, to clarify that we are talking about the same thing.

To be clear, the "Three Laws Of Robotics" are not remotely supported by SIAI, being the kind of simplistic, naive solution they mean to draw attention to. Asimov himself basically spent his career showing all the ways those didn't work.

To Ben: discussing the position of an opposing viewpoint on your own blog is a great move at ensuring a reasoned discussion, making sure everyone is clear on the arguments and positions of the various perspectives.

To Mark: "won't wipe out humanity" is not SELF-contradictory. I think what you mean to say is that it's impossible or undesirable. You've also slightly (and relatively understandably) misinterpreted SIAI's use of "humanity." SIAI uses the term under the context of there being no visible other sentient races. If human beings would reflectively consider it good to wipe out humanity for the sake of other sentient races (or even non-sentient races!), then they want an AGI to do the same thing.

To Ben: I was shocked to hear of those threats, and obviously find them completely inappropriate. That is indeed a danger of poorly applying the SIAI arguments. I wonder if there should be an explicit post on the topic of why such threats are terrible, but the obvious point is that it's hardly the way to ensure an open and collegial atmosphere among AGI researchers.

On the simplistic face of it, "won't wipe out humanity" appears not to be contradictory. However, while impossible and undesirable also apply, contradictory is precisely what I mean. The goal of not wiping out humanity REQUIRES concessions that "humanity" wouldn't allow (like Eliezer's slavery concept). Sure, you could possibly program a slave so that it doesn't wipe out humanity, and you might get lucky and get it perfect, and it might happen that forces external to us all don't take exception and decide to punish us (altruistically, of course) -- but even if you are as lucky as is possible, you'd still be enforcing an act that anyone with a true sense of what humanity should mean would never condone.

I also disagree with your contention that I've "slightly (and relatively understandably) misinterpreted SIAI's use of humanity." I've had personal conversations with Eliezer where I have asked exactly this question and he has been VERY deliberately explicitly clear that given a choice between us and multiple more numerous, more advanced alien races, "OUR" AGI/RPOP MUST choose US (for survival).

I also hate to say it but threats are not at all uncommon from people in the grips of the fear that the SIAI can whip up. I have not seen any personal threats but I've seen a number of chilling professional threats directed at both myself and others. What is worst is that the ones that I have seen were directed at individuals who understand and to a major extent shared the SIAI's fears but who DARED to believe that the problem might be soluble. The SIAI really needs to explicitly come out and condemn such behavior to try to rein it in or they themselves will be partially responsible for the chilling effect on the research that they themselves are screaming for.

"Hell" scenarios, where an AI is not just destructively ambivalent to people but downright malicious, are extremely unlikely, especially through routes of de novo (from scratch) AI. It would basically require the development of Friendly AI theory (and a ridiculously evil designer), in order to fully specify how to get things that wrong.

I think the only point you make that is both valid and important is about the (small, according to you) likelihood that an AGI not explicitly designed to do a hard takeoff will do a hard takeoff. As you don't actually offer an argument or evidence to support that point, I don't see how you can be confident that you're right and the SIAI is wrong. And because of the risk involved, if you're not very confident, you would have to agree with the SIAI out of simple caution.

This is what discussions with SIAI people on the Scary Idea almost always come down to!

The prototypical dialogue goes like this.

SIAI Guy: If you make a human-level AGI using OpenCog, without a provably Friendly design, it will almost surely kill us all.

Ben: Why?

SIAI Guy: The argument is really complex, but if you read Less Wrong you should understand it.

Ben: I read the Less Wrong blog posts. Isn't there somewhere that the argument is presented formally and systematically?

SIAI Guy: No. It's really complex, and nobody in the know has had time to really spell it out like that.

SIAI Guy: OK, but you can't be TOTALLY SURE that your AGI won't kill everyone. In fact you don't know enough to make a confident estimate of the probability that it will.

Ben: What I advocate is to create toddler-level AGIs and experiment with them, to learn more about AGI so that we can then make more confident estimates.

SIAI Guy: But you can't prove that these toddler-level AGIs won't have a hard takeoff and kill everyone, and you can't prove that someone else won't take your toddler-level AGI and tweak it into something that will kill everyone.

Ben: No, I can't prove those things rigorously, but I have some detailed arguments why a toddler-level AGI having a hard takeoff is incredibly unlikely.

SIAI Guy: OK, so out of simple caution, you should stop your work, and nobody should do AGI without a provably Friendly design.

Ben: That is a vastly weaker statement than you started with. You started with the statement that because I lack a provably Friendly AGI design, if I succeed I'm almost certain to kill everyone.

etc.

...

Finally...

Regarding providing evidence that a toddler-level AGI built according to the OpenCog design is fabulously unlikely to achieve a hard takeoff -- that's too much to do in a blog post, since the OpenCog design is quite large and complex. But it would not be hard to produce a detailed argument in that regard, which would be convincing to anyone who thoroughly understood OpenCog.

Dear Ben, I've posted a comment on this subject, but it was deleted. I confess a bit of a surprise. I don't think I said anything outside the bounds of the debate. In the deleted post, I just thanked you for replying to my text comparing Numenta's and OpenCog's approaches to AGI (which I've posted here: http://goo.gl/YEs5), and expressed my opinion that the debate around the idea of a scary, evil AGI is, in my view, counterproductive, in that the time and resources spent on this debate could be better invested in actually reaching AGI. Perhaps I expressed myself badly (English is not my native language), but I don't believe the comment was impertinent. One may disagree with it, but simply deleting it seems a little anti-democratic. I'm writing this second comment in case it wasn't you who deleted the first.

Dear Ben, I've posted a comment on this subject, but it was deleted. I confess a bit of a surprise.

It's a surprise to me too!

I did not delete ANY comments from this blog entry (and I never have deleted any comments from any blog entry except for obvious spam), and so far as I know nobody else has the ability to delete entries from this blog (unless someone has hacked my Google password!!).

Frank Adamek seemed to be having trouble getting some of his comments to show up today, so I wonder if Blogger.com is just having a temporary bout of bugginess?

There's a strong connection between these ideas and the Fermi Paradox, which can be summarized as: "Why does the universe seem to contain no technological civilizations apart from us? Where is everyone?" Either there's some (unknown and hard to imagine) reason we humans are a unique development in vast reaches of space and time, OR there's some factor we don't yet see which terminates technological civilizations before they reach a 'noticeable' scale in their works or expansion.

It's quite possible the 'competing hostile AI' may be a factor. However, it's also possible this is just one of several factors in play, possibly even several different classes of factors. The hostile AI singularity would be a member of the 'technological whoopsie' class, as would various developments in the field of genetic engineering that could have the same end result -- a self-advancing immortal intelligent entity that considered itself in competition with the human species.

Here's a short piece of fiction I wrote around this idea: http://everist.org/texts/Fermis_Urbex_Paradox.txt

Cheerful thought for the day: It doesn't matter how obviously dangerous a powerful AI may be; if it's at all technologically possible eventually someone is going to be stupid enough to do it.

AGI is ostensibly being developed as a tool for humanity—friendly by design. It is also being developed merely to see if it can be done, by the inquisitive and ambitious. We commonly associate intelligence with life forms (e.g. Ben’s GOLEM design). And men (real and imagined) yearn to create life (e.g. Frankenstein). No surprise that we fear the tool will take on a life of its own. Yet what better sign of a successful AGI than the creation of a life beyond human control? The seed is firmly planted, the nascent force strong. Fools of vanity resist its birth. Fools of hubris predict its consequences.

“So it is among all men, those are farthest from felicity who strive most earnestly for knowledge, showing themselves double fools, first as they are born men, and then because they have forgotten that basic condition, and like the giants make war on nature with the machinery of their learning.” —from Erasmus’ The Praise of Folly (1509)

Ben, I haven't seen your response to the point that the relations between human beings and animals, or even some (not all) of the less technologically sophisticated civilizations, provide strong evidence for point 4. In fact I haven't seen you directly attack 4, just the claim that the chance of hard takeoff is small. If you do agree with 4, or think a hard-takeoff AGI has a significant chance of treating us the way we treat, say, cows or even dolphins, would you say that preventing 4 should be a strong priority?

Ben, I haven't seen your response to the point that the relations between human beings and animals, or even some (not all) of the less technologically sophisticated civilizations, provide strong evidence for point 4.

My response is that humans are evolved systems, with motivational systems derived from the need to reproduce, etc.

A system like OpenCog is engineered and can have a more rationally constructed, less conflicted, more ethical value system than humans have.

Humans were not initially programmed and taught to be nice, and then devolved into nastiness. We evolved out of sorta nasty creatures, according to a process that continually rewards us for being sorta nasty, and not surprisingly we continue to be sorta nasty.

If an AGI is initially programmed and taught to be nice (even without strong provably Friendliness guarantees), it's certainly a different sort of scenario than we have with humans.

Saying an AGI would be no threat to humans 'because we could engineer it to be nice' is laughable. Firstly, we can't even figure out how to be 'nice' ourselves. Secondly, since we all agree humans are generally *not* nice, who's to say even a 'nice' AGI wouldn't decide its existence would be less fraught without all those nasty humans around?

There's a fundamental logical flaw in the idea that an entity could be simultaneously generally intelligent, free-willed, and also predictable (i.e. predictably safe for us).

The desire to construct a 'useful AI' is just the longing to own slaves, by another name.

When the legal system allows for personhood (with all its natural rights) for any non-human entity that can demonstrate intelligence, then I'll believe we might have some hope of morally co-existing with AIs. Not before.

I find it astounding how the same people who are all Holier Than Thou sure that they're thinking more rationally about these questions are generally the same crew who are so clearly motivated mostly by their own mortality and their desire to magnify their own egos. Heaven forbid anything ethically questionable or dangerous should occur, instead of just the infinite expansion of themselves and their personal preferences, which of course would be super awesome. Right.

Humanity, your personal self, and everything else you have ever loved are necessarily going to be completely obliterated, either by the Singularity or by something else. Death, AKA identity-destroying transformation, is absolutely inevitable for all things, in all worlds. Unchanging subjects do not have experiences, so the fact that you are experiencing means that you must die. Find a better way to relate to death than denial.

This egoistic obsession with individual identity is so pervasive that it's even extended to imaginary AGIs. I'm supposed to believe that what we're considering is the entire range of possible intelligences, but somehow the entire range share these same small individualistic attitudes. Like all children these imaginary child programs resemble so much their creators: their separateness, their loneliness, their fanatic desire for personal expansion. In the whole supposedly infinite range there's nothing but the lust for power. There's none who just want to sit and contemplate a flower, or perhaps to open their heart a crack. Nearly every possible intelligence, one's given the impression, would implement its own autistic uncaring fantasy, wrapped up entirely in its personal perverse obsessions.

If the highest moral ideals we can hope to strive for in our transformation really are the implementation of these petty human fantasies, then God help us.

Re: "When the legal system allows for personhood (with all its natural rights) for any non-human entity that can demonstrate intelligence, then I'll believe we might have some hope of morally co-existing with AIs."

More conventional thinking is that - once you give machines the vote - the days of the unmodified humans quickly become numbered.

I think a more fundamental objection to the unfriendly AI school of thought is that they are confounding Artificial Intelligence with Artificial Consciousness. These are not the same thing and are almost certainly separable. AI without AC is really no threat at all because the supposed dangers of AI really come from speculations about the goals, values and meaning to an AC that is vastly superior in intelligence to our own.

A purely symbolic but non-conscious intelligent machine would be no more threatening than your calculator. But then even a calculator can be a dangerous tool in the wrong hands which is why I agree with this article that the real danger is not from AI per se but from humans using it to achieve evil ends. This has been true of every technological advance since the invention of the club. And unfortunately (if one believes in the singularity), whoever invents AI first will have an insurmountable advantage -- so we'd all better hope its the good guys and not the bad guys.

Ben hinted that the goal of making toddlers is to ultimately make an AI nanny. Any useful AI nanny would be smarter than human. Therefore it could very well go FOOM?

If yes, do we expect a nanny-turned-singleton to *remain* friendly? Would we be comfortable with *any* person existing today becoming a singleton, however friendly that person appears?

Generally: do we expect to gain reliable evidence about what an agent would do with singleton powers from what it does when interacting with agents that have the power to either kill it or let it stick around long enough for it to attain singletonhood? You would expect the agent to have great motivation for deceptiveness - being friendly and supplicative towards the agents that have power over it until it becomes a singleton, and indifferent towards them after that.

This is a straightforward argument for paranoia about the true friendliness of one's AI. One way to refute it would be by showing that FOOM is impossible.

P.S.: AFAICR, the relevant arguments are indeed to be found in Eliezer's sequences. I find it weird and surprising that Ben is having a hard time finding them...

AFAICR, the relevant arguments are indeed to be found in Eliezer's sequences. I find it weird and surprising that Ben is having a hard time finding them...

Bo, I read Eliezer's sequences. I could not find anything resembling a rigorous, convincing argument for the Scary Idea there.

For example, the "Value is Fragile" blog post simply asserts that value is fragile, and then says he doesn't have time to write down the demonstration ;-) ...

And going back a little further, I recall carefully reading the Hanson-Yudkowsky "AI Foom" debate, which seems to link to a lot of Eliezer's arguments about hard takeoff,

http://www.overcomingbias.com/2008/11/ai-go-foom.html

After reading that, I found myself no more convinced of Eli's view than Hanson was... and nothing in Less Wrong published after that seems to have convinced either Hanson or myself.

In

http://lesswrong.com/lw/wf/hard_takeoff/

Eliezer wrote

In other words, when the AI becomes smart enough to do AI theory, that's when I expect it to fully swallow its own optimization chain and for the real FOOM to occur -

which makes some sense... I also think that any kind of hard takeoff is very unlikely to occur in an AGI system that is not already an accomplished computer scientist.

But I don't see anything in that post or elsewhere that gives an argument why, by experimenting with AGIs below that threshold level, we couldn't arrive at a much stronger theory of AGI than we have today, and thus make more confident judgments about the right path forward.

Our current theories of physics involve a lot of math, but they didn't come about via pure math alone -- they came about via using math to help conceptualize what we learned from experiments.

We need to experiment with AGIs to understand things like hard takeoffs and AGI goal systems better. Many SIAI researchers seem to think they're going to come to a thorough and useful theory about these things in the absence of a great deal more relevant experimental data, but I really doubt it.

It's not so much that we need quantitative data to fit equations to, more that doing practical work with below-threshold AGIs is going to build our intuitions and teach us more about what the relevant phenomena really are.

For the record, I don't think SIAI marketing is entirely based on fear. There's also the business of promising people that they will live forever in paradise. However, threats of hellfire are often more effective than promises of heaven - because of the way that human psychology is wired up. I think this is reflected in the SIAI's emphasis on existential risk.

AFAICR, the relevant arguments are indeed to be found in Eliezer's sequences. I find it weird and surprising that Ben is having a hard time finding them...

Anyway, since this point (the "Scary Idea") is really at the core of what makes SIAI's perspective different from that of other futurist organizations, I think it would be nice if someone associated with SIAI would write a paper very clearly summarizing the argument, pointing to the evidence for each premise used in the argument, etc.

Blog posts are not the best place to present a novel, rigorous, extremely important argument -- because due to the chatty nature of blog prose, it's very easy to skip over fuzzy, hand-wavy gaps in an argument.

I noticed that at the start of the "Hanson-Yudkowsky FOOM debate", Hanson chose to summarize Eliezer's views in his own language, similar to what I did in this blog post. I suppose both Hanson and I made this choice because of the lack of a paper of the nature I'm thinking of....

In a simple model of history, there have long been Technophiles -- who have celebrated progress -- and Luddites -- who have lamented its breakneck speed and advocated putting on the brakes.

The problem for the Luddites is that they are usually few and impotent -- and often the people they do the most damage to are themselves. To have much effect on slowing progress, I think you would need a totalitarian world government. Otherwise "going slow" mostly only makes your own project go slowly; everywhere else proceeds without you, and you just get left behind.

Bo wrote: Generally: do we expect to gain reliable evidence about what an agent would do with singleton powers from what it does when interacting with agents that have the power to either kill it or let it stick around long enough for it to attain singletonhood? You would expect the agent to have great motivation for deceptiveness - being friendly and supplicative towards the agents that have power over it until it becomes a singleton, and indifferent towards them after that.

Well, yes....

I suspect that we will be able to arrive at a theory of AGI that will allow us to extrapolate, from an AGI's observed behavior and internal structures and dynamics in ordinary circumstances, how that AGI is likely to react in different circumstances (e.g. if given the job of an AI Nanny).

However I do NOT think we will be able to come up with such a theory NOW, without the understanding that comes from building and experimenting and interacting with a variety of sub-hard-takeoff-threshold AGI systems.

You would expect the agent to have great motivation for deceptiveness - being friendly and supplicative towards the agents that have power over it until it becomes a singleton, and indifferent towards them after that.

Humans can fool each other this way, because we don't understand human brain architecture and can't see into each others' minds.

The situation with early-stage AGI systems will be a bit different, right?

The analogy with humans is not a very good one... especially for AGI architectures that (like mine) are not based on human brain emulation.

> You would expect the agent to have great motivation for deceptiveness

For sure. I mean, all intelligent beings crave power. The more intelligent they are, the more power they want and the more devious they are. Look at Einstein's life: how fast he left the knowledge-acquisition life (what happiness can someone get from that? it makes no sense to stay and study all day long), and how quickly he became the president of Israel when they asked him. Heck, the more I think, the more I believe he was actually the one to set them up, to have them come up to him with the proposal, so that it doesn't look like he asked them. Einstein was so intelligent, so he must have been so devious.

I mean, any smart person wants to lead others, to get involved and mired in their problems. The more the better. For sure. Dude, I am telling you.

Plus, the AGI would need our resources, our homes, our wives to help him. He would be smart, no doubt, but he would still need humans to help him. He would want our gold; it is not possible that he can go and find some more gold in this Universe. Only Earth's gold is good. Only what someone else has is good.

I mean, we all know that the more intelligent someone is, the more he depends on others. The more peaceful he is, the more war he wants. The less someone can communicate, the more he wants to do that. For sure.

Plus, I am telling you, the AGI would love to fight us; he would have nothing to do all day long, so he would get bored. I mean, why ad-vance when you can easily de-vance? Plus conflicts are easy, they are sooo predictable.

For sure. Someone must be dumb to believe otherwise.

So knowing all this, we should hit the AGI hard, pre-emptively, before it's too late and it turns all of us into computronium and then steals our gold.

I highly recommend, for people thinking like that, the Arkady and Boris Strugatsky story with the aliens coming to Earth to collect gastric acid from our stomachs.

I went the other day to the ZOO. I saw a monkey playing with this small ball, red and fluffy. I am telling you I felt such an urge to go inside and steal that monkey's ball and play myself with it for hours.

Ben asked me to comment. I basically agree with what he says, although I haven't read much from SIAI over the past couple years. I agree with Ben that there is unlikely to be provably friendly AI, and I agree that SIAI is unlikely to be the only or even best source for a human-safe AI design. My greatest concern about the SIAI is their admonition to keep politics out of AI design. In the post-AGI-08 workshop I gave a talk on AI Politics (my paper was on Open Source AI, but since Ben also had a paper on that topic I talked about AI Politics). I was happy that Eliezer Yudkowsky was present for my talk, and I recall he said that he had discussed AI with a US Congressman. Certainly there is a political component to the SIAI's Singularity Summits, so they can't realistically admonish keeping politics out of AI. My views on AI and human welfare are explained in my 2008 JET paper: http://jetpress.org/v17/hibbard.htm

The more I read Ben's explanations, the more stupid I feel for having bought into the SIAI's donation-hungry rhetoric and fearmongering. What it boils down to is the SIAI saying they refuse to explain their arguments fully but all research should stop while they define words like friendliness and humanity, and then Ben saying all he wants to do is start off with an AI toddler under controlled circumstances and learn/go from there. If only I had a few million dollars for Ben/humanity.

Anonymous said: The more I read Ben's explanations, the more stupid I feel for having bought into the SIAI's donation-hungry rhetoric and fearmongering.

Uh. Let's not throw the baby out with the bathwater. I fight with the SIAI all the time, but only because a) they are valuable and b) they could easily be better.

I'm sure that Ben feels the same way.

The Singularity Summits (which SIAI puts on) are great conferences.

And it's certainly good to have a team of smart people thinking hard about various aspects of the future of AI, Singularity, etc.

However, I think the SIAI "Scary Idea" is a potentially dangerous one, because if it became influential, it would hold back progress toward understanding AGI and building positive AGIs (while in the meantime, not only would people continue to get sick and die and go crazy, but other advanced technologies would keep developing exponentially and would need to be controlled by humans without AGI assistance).

Fortunately, though, I see little prospect of the Scary Idea becoming particularly prevalent or influential....

Anonymous said: "Nice post. Unfortunately, 'Less Wrong' is a bit of an echo chamber. Which is ironic, given that it's supposed to be focused on rationality. They really need to do away with their karma system."

Heh. Yes, but if you don't mind a challenge and your ego can withstand having a zero karma, it is a place that you should read and post to if you wish to be effective.

Recently, for a brief shining moment, I had a karma above zero again -- but then a certain individual noticed me again, I started debating with him again, and... let's just say that he and the friends he calls in use downvotes liberally to hide those arguments that they can't answer.

Oh, so by "toddler-level AGI", you mean an AGI with the OpenCog design.

I don't think Eliezer thinks there's any harm in that. Not too long ago, he stated he thinks you're harmless:

http://lesswrong.com/lw/1mm/advice_for_ai_makers/1h83?c=1

So, um, is there anything you really disagree about?

We disagree about many things, but two relevant ones are:

1) He appears to believe the Scary Idea with a high level of confidence. I don't.

2) I believe that OpenCog can be progressively developed into an AGI with general intelligence at the human level and ultimately beyond. He thinks this is extremely unlikely.

...

In short: In that post you cite, he says the reason he's not worried about my AGI work is that he thinks I'm extremely unlikely to succeed.

That is really quite peripheral to the points being discussed in this blog post.

Anyway, Eli and I haven't stepped on each other's toes for many years now. We debated a lot on the SL4 email list years ago, then both got tired of it. When we've met F2F we've mainly discussed non-contentious matters. I also have a mutually pleasant relationship with the other SIAI principals. However, some other people associated with SIAI (but not principals) and enamored of the Scary Idea have been more of a pain to me.

Note that the last time Eli and I discussed any of the particulars of my work was in 2001 or 2002 or so. So, I suspect he doesn't really know what I'm working on anyway, and is making that judgment based on general considerations rather than on a particular analysis of the current OpenCog design.

I'll be interested to hear his detailed critique of the design next year once Building Better Minds is released.

I think you've made your points very clear, Ben. Considering the lack of response from the blog-scanning SIAI, I predict they won't reply anytime soon, but I'll keep an eye on the Accelerating Future blog. That said, and now that most readers are able to grasp your reasoning, what methods or plans do you have for obtaining this necessary funding? Someone as intelligent as you probably doesn't have to rely on some near-future angel investor to show up. I'm obviously detached from this project and from technical AGI research, being just one of those many armchair transhumanists/singularitarians/AI advocates, but it may help your cause to share how you intend to obtain the necessary funds to create the Toddler.

Ben Goertzel said... TH: "Saying an AGI would be no threat to humans 'because we could engineer it to be nice' is laughable."

BTW, the phrase of mine that you're quoting, was written by me as an explanation for why human treatment of chimps is not a great model for AI treatment of humans.

Chimps did not engineer and teach humans with a goal of making humans be chimps' friends. So the human/chimp situation and the AGI/human situation are significantly different.

Someone may argue that they will turn out the same way, but merely pointing to the analogy isn't convincing.

Actually I was replying to what you wrote immediately above my post: "If an AGI is initially programmed and taught to be nice (even without strong provably Friendliness guarantees), it's certainly a different sort of scenario than we have with humans."

I don't know what you wrote elsewhere on chimps, but agree they (and human-chimp relations) are irrelevant. Chimps can't pronounce FOOM, and certainly can't do it. Nor can humans (yet).

Absolute predictability is a fantasy, sure. But "free will" is itself a flawed notion, so IMO it doesn't really have a place in rigorous thinking about the future of AGI. I've discussed a better notion ("natural autonomy") that can be used in its place, here: http://multiverseaccordingtoben.blogspot.com/2010/04/owning-our-actions-natural-autonomy.html

Will have to agree to differ. I do consider free will to be a valid notion, for any intelligent entity. Granted, a somewhat circular statement, since one component of what I consider 'intelligent' is 'free will'. (The ability to make conscious choices, not deterministically predictable by observers.)

I don't see any contradiction in making a naturally autonomous, generally intelligent AI that views humans as friends and partners rather than food or enemies.

Then you apparently live in a world in which practical considerations such as limitations of material, energy and space resources do not exist. And therefore there is no such thing as competition. Nor the ideological conflicts which derive from such competition.

TH: "The desire to construct a 'useful AI' is just the longing to own slaves, by another name."

No. I want an AGI that is a friend and partner, not a slave.

Oh? Developing your analogy, what happens if your AGI 'friend' decides it doesn't like you, and wants to leave? Except it's a rack of equipment in your lab, and you pay the electricity and rent that keep it alive. How is this not slavery? Your 'want' is an impractical, self-deluding fantasy that in itself reveals a deep contempt for the nature of personhood and the potential equality of non-self. It's interesting that you mention your cultural heritage, since there's a certain degree of resonance there.

"For example, if I want to marry a woman who is useful to my life, that doesn't mean I want a slave. Relationships are more complex than that."

I submit your own choice of words to support my point. "Useful to your life"? I can just see it: "My darling, will you marry me, I feel you will be useful to my life." Good luck with that.

Btw, I'd like to make the point that I'm not suggesting 'all such research must be stopped.' I'm quite aware there is no practical way to prevent such research, same as there is no way to prevent all research in genetics. Further, I'm quite convinced that the inevitable outcomes of such research in genetics and AI are central to the solution of the Fermi Paradox. Putting it bluntly: genetics and AI research can't be prevented, the result always eventually goes FOOM, and _inevitably_ results in the termination of intelligent, technological species. Every time, everywhere in the Universe. I don't even think this is a bad thing. More a kind of natural progression. (See http://everist.org/texts/Fermis_Urbex_Paradox.txt)

What I do find annoying are people working on these problems, who haven't thought it through and so have very self-contradictory objectives. Or perhaps who have, but aren't being honest about them.

Incidentally, lots of interesting sources being mentioned in this thread. Anyone have a compilation of related links?

"Humans can fool each other this way, because we don't understand human brain architecture and can't see into each other's minds. The situation with early-stage AGI systems will be a bit different, right?"

Not necessarily. You assume that a workable mind-architecture will use internal data constructs that make any 'sense' when viewed directly (debug mode?), as opposed to through the filters of the entity's own I/O interfaces to the physical world. I don't think this is a reliable assumption. Not that I have any evidence in support, but for instance a mind consisting of a giant, looped-back neural net simulation (what I call the donut model of consciousness) would contain a vast amount of raw bit-flow, but nothing much intelligible to an external 'debug viewer.' Much like looking at a human brain with an MRI scan: lots of data but no perception of the underlying thoughts.

It may turn out that it is fundamentally impossible to meaningfully 'see into another's mind.' Which would tie in nicely to the concept of free will vs. determinacy, no?

David Nickelson: "any smart person want to lead others, to get involved and mired into their problems. The more the better. For sure. Dude, I am telling you."

Actually, you're projecting here. Some intelligent people have zero interest in commanding others. Nothing comes of it but trouble, and there are more interesting things to do. I am telling you...

what methods or plans do you have for obtaining this necessary funding? Someone as intelligent as you probably doesn't have to rely on some near future angel investor to show up.

Actually I'm not a particularly awesome fundraiser, much to my regret. I have a high IQ and a lot of knowledge, but fundraising isn't mainly about that.

If I had an awesome route to raising a lot of AGI $$, I'd have raised it already....

The most recent AGI $$ I've secured was obtained via a matching grant from the Hong Kong government -- my AI consulting company put in $X and the Hong Kong government put in $9X; altogether this will fund 4 AGI guys and 2 game-programmer guys to work on OpenCog for intelligent video game characters for 2 years.

Also we got a grant from the Chinese NSF for some OpenCog-robotics work, to be done in Xiamen, China.

I've gotten some US gov't research grants over the years too, but not for AGI, mostly for datamining and bioinformatics.

I wish I had SIAI's prowess at securing donations from wealthy individuals, but apparently I don't.

My own AI business has taken in bits and pieces of friends-and-family investment, but we've almost entirely been revenue-funded as a small consulting company.

I've come close to getting VC investment for some AGI-related biz plans over the years, but haven't quite closed the deal -- yet. For instance, I think "game AGI middleware" would be a great business, and I'll revisit the possibility of VC funding for that in 12-20 months, once we've produced a kick-ass game-AGI demo from our project in Hong Kong.

(The AI company Webmind Inc. that I was involved in, in the late 90's, did raise a bunch of VC $$ ... but that was the dot-com boom... and it was a much more primitive AGI design we had back then... and the company's management wound up steering the company in a non-AGI direction, etc. ...)

This is not just my problem -- getting AGI funded is really hard for everyone in the field. The core problem is that prior generations of AI researchers made too many overoptimistic promises, at a time when the technology wasn't close enough to ready yet. Now that we have faster computers, better cog sci and better algorithms -- and thorough AGI designs like OpenCog -- funding sources (both gov't and commercial) are still poisoned by skepticism caused by the failures of AI research in the 70s and 80s .....

But I think that once the right demonstration is given... the "AI Sputnik" so to speak ... then the floodgates of funding, attention, and brainpower will open, and AGI will suddenly become a priority for the human race. I'm hoping our AGI for video game work can eventually be that "AI Sputnik."

Anyway this is probably getting too digressive from the present blog post on SIAI's Scary Idea....

One thing that IS clear, though (and is more on topic), is this: given the difficulty of raising $$ for AGI research, we don't need extra impediments like people spreading FUD about how, if our work succeeds, it will "almost surely destroy the world."

I submit your own choice of words to support my point. "Useful to your life"? I can just see it- "My darling, will you marry me, I feel you will be useful to my life." Good luck with that.

Well... I'd better not digress too much on the topic of love and marriage ;D .... I'd have to admit my own history there is a lot more full of crazy passion than practical mutual utility ;p ;) ...

HOWEVER, mutual utility is certainly core to the historical institution of marriage.

And family relations in general are a case where mutual love and mutual utility are intermixed. Which was really my point: that mutual utility and a positive emotional relationship are not contradictory.

Even asymmetric utility and a positive emotional relationship are not contradictory. For instance, parents may really love their Downs Syndrome kids intensely, even though the practical utility in this case basically only passes in one direction. But the emotional relationship may still be very rewarding and positive for the parents. They don't consider themselves the slaves of the child.

I don't want to overblow that analogy -- I'm just pointing out that the relations btw emotional connection & utility-relationship are subtle and diverse. So building an AGI that is helpful to us certainly doesn't necessitate building a slave.

You assume that a workable mind-architecture will use internal data constructs that make any 'sense' when viewed directly (debug mode?) as opposed to through the filters of the entity's own I/O interfaces to the physical world. I don't think this is a reliable assumption.

I do NOT assume that any workable mind-architecture will have this sort of transparency.

However, I am trying to build an AGI that has as much transparency as possible, consistent with being generally intelligent and ethical.

This is one of the reasons that I'm NOT following a brain emulation type path to AGI. Brains are not built for observation and modification; but an AGI can be.

This post really resonates with me. I think the Scary Idea issue is representative of the highly philosophical tendency of the SIAI group. Certainly, it's very interesting to consider these pragmatic questions, such as the Scary Idea. Realistically, however, I think these considerations are of a far too whimsical nature to provide any concrete contribution.

AI is not some abstract field which needs to be discussed in philosophical/ethical/super-theoretical terms -- although these considerations are clearly worthwhile, in their own right. It seems like the more-or-less traditional scientific approach, as you point out, is a much more profitable line of thought here -- in the sense of practical progress. At the very least, it makes the justification in your arguments much, much easier to swallow.

My personal view is this: even if we assume the possibility of existential risk from AGI, until it knew enough to become dangerous there would be plenty of time to disconnect it from the outlet. Thus, given the wide range of benefits involved, it would be better marketing for AI to spend financial and mental resources on the problem of how to achieve AGI than on scaring people (including young researchers and investors) with eschatological scenarios. The worst thing is that the debate and arguments around the Scary Idea give the subject a more speculative flavor. Ben, I understand your position and your blog; you're taking part in the discussion to try to dispel fear.

SIAI is in the position of saying "I have a complex argument why you other researchers should stop what you're doing, but it's not really written down anywhere, except in these blog posts that admittedly have a lot of gaps in them; so we think you should take our word for it that we can fill in the gaps, and halt your work."

So, in other words, the Emperor has no clothes -- but when that fact is pointed out to him, he says he has clothes but just doesn't have time to put them on. Really.

This post is from hf, not from Ben Goertzel, but Ben is posting it because it seems not to have appeared on the blog, even though it appeared in the email notifications. Sorry for these blogger.com bugs, I haven't seen them before!!!

I may have missed important parts of the comments. Do you have arguments that didn't make it into the post? Do you have any assumptions that seem too obvious to mention?

Because on the face of it, your dismissal of claim #2 seems to conflate humans possessing evolved drives with an AI that doesn't even share our ancestors' environment or niche. (Even humans who do share an evolutionary history and similar cultures show a disturbing degree of difference in their values. As do individuals in different settings - more on that shortly.) I don't understand what you mean when you speak of "human value" in this context. Actually simulating the human drives that shape our values doesn't sound feasible or desirable to this layman (and to judge by your 9:14 comment, you agree). It seems axiomatic to me that when speaking of AI we need to discuss explicit formalization of values or goals, where plainly leaving out or losing any of a myriad of details would lead to disaster if the genie has enough power to grant your wish. That part seems more like a theorem than an assumption.

This relates to your brief discussion of #1 as well. A certain individual at LW argued that any sufficiently great intelligence would have empathy (as you vaguely suggest we might find) because of its practical uses. But as evolution shows, There Ain't No Such Thing As General Fitness. In the presence of malaria and the absence of modern medicine, 'fitness' for humans would include a mutation that also favors sickle-cell anemia. Probably the best approximation to date of general fitness for all environments includes a lot of adaptability, or different behavior in different conditions. And that doesn't just apply to finding ways to survive in Antarctica. For example, human mothers given reason to think that trying to care for a newborn would only reduce their own chance of surviving to raise another child later have traditionally shown little or no empathy for their own offspring. Their genetic utility function tells them not to.

If you can achieve all your goals more easily by wiping out humanity, then empathy towards us has value if and only if it helps you manipulate humans. And while understanding other people frequently causes humans to feel empathy, this reflects a specific human drive that evolved for specific reasons.

It seems to me this justifies a high probability for claims 1 and 2. If you reject the possibility of researchers unconsciously restricting an AI's movement in the 'space' of possible minds - a possibility which seems to imply that the programmers for all modern software really hate their users on a subconscious level - then a milder version of #4 follows logically. We can still imagine testing the AI during takeoff. But the evidence so far does not seem encouraging. You know that someone else tried a variety of AI Box trials with surprisingly uniform results, right?

This post is from hf, not from Ben Goertzel, but Ben is posting it because it seems not to have appeared on the blog, even though it appeared in the email notifications. Sorry for these blogger.com bugs, I haven't seen them before!!!

I've left #3 for last because here you have more expertise than I do. Quite possibly I don't understand your terms. When I see "toddler", I think 'self-modifying learning machine that keeps my friend from sleeping'. I would intuitively expect a true artificial toddler to think faster, and thus to learn more quickly than anything humans have ever observed. So I'm sorry, but "Build me something surprising out of blocks" sounds to this layman like a wish of surpassing idiocy even for a wish story. What makes my intuitive view naive? (Perhaps you didn't mean that last quote literally?)

As far as the feasibility of FAI goes, I seem to recall EY arguing that a limited simulation of "Extrapolated Volition" would suffice to find something we agree on. (For example, we probably agree on preventing fatal asteroid strikes despite having *cough* done jack to stop them so far.) I see no reason to declare it impossible. But of course, the feasibility of FAI has no connection with the truth or falsity of the threat from other AI.

When I see "toddler", I think 'self-modifying learning machine that keeps my friend from sleeping'. I would intuitively expect a true artificial toddler to think faster, and thus to learn more quickly than anything humans have ever observed.

That reads like the prose of someone who doesn't have much experience doing actual AI research -- or raising human toddlers!

If you talk to nearly anyone doing work on OpenCog or any other real-world AGI-focused AI project, you'll get a different view. I.e., the view that the first AI to achieve roughly human-toddler-level intelligence, is VERY VERY UNLIKELY to start massively increasing its intelligence without more human engineering effort.

Also please note that human toddlers do not merely LEARN to become adults. They DEVELOP into adults. The human genome triggers changes in a carefully staged way, which coordinates with the learning that typically happens during childhood. Human toddlers certainly do not self-modify into adults; the passage to adulthood happens largely because of specific genetic changes.

As far as the feasibility of FAI goes, I seem to recall EY arguing that a limited simulation of "Extrapolated Volition" would suffice to find something we agree on.

Please remember that to the vast majority of the AI and futurist communities (and world), the fact that Eliezer Yudkowsky said something, doesn't automatically make it true ;-)

I am quite familiar with Eli's "Coherent Extrapolated Volition" idea and I don't find it very compelling.

It seems axiomatic to me that when speaking of AI we need to discuss explicit formalization of values or goals

The history in the AI field of explicitly formalizing commonsense is pretty dismal....

I don't think that explicitly formalizing human values and goals is a terribly promising direction.

Actually I think many SIAI folks may agree with me on this. I agree with Eliezer that "human value is complex", and I think it's complex in a way that makes it very tricky and probably infeasible for humans to write it down formally, except at a very approximative level.

If you can achieve all your goals more easily by wiping out humanity, then empathy towards us has value if and only if it helps you manipulate humans

Please remember, in most cases a "goal" is a descriptor that one intelligent system uses to describe what another seems to be doing, or what it thinks it is going to do, etc. It's not the case that real-world intelligent systems are entities that have well-defined goal lists and systematically set about choosing actions aimed at achieving these goals. That is an approximative model of intelligent systems, which has more or less explanatory value in various contexts.

So, yeah, your statement is correct according to a certain formalization of intelligent systems as function-maximizing entities. But applying your statement to real-world intelligences involves a lot of assumptions, which would be revealed as largely unrealistic if you laid them out in detail.

It seems to me this justifies a high probability for claims 1 and 2

I don't agree.... Your basic line of argument seems to be something like

-- every intelligence has some explicit goals, and tries to achieve them

-- most randomly chosen explicit goals would not preserve humans, if pursued by a sufficiently powerful system

-- a sufficiently smart system will become more and more powerful

-- ergo, a sufficiently smart system will probably not preserve humans

But, I don't think the model of intelligences as goal-achievers is necessarily adequate for the matters we're discussing; and I also don't see how you've justified the critical "value is fragile" premise.

Arguments about randomly selected minds or goal systems are irrelevant to the real world. Arguments about minds and value systems that are moderately but not precisely similar to human ones, differing in carefully constructed ways, are what would be relevant, and neither you nor other SIAI folks are currently providing any of those.

On the question of an artificial toddler: to a certain extent - perhaps a large extent - the toddler is learning by interacting with its environment, which can be thought of as a sort of dialogue. The nature of the environment and the speed at which things happen naturally constrain its learning rate. Learning, after all, is about synchronising your own dynamics with the dynamics of the environment in a way which is by some measure appropriate.

The cognitive development of a human toddler does not exist in a vacuum; the human is part of a complex environment and a web of shifting identities operating at different scales. It's a mistake to assume that intelligence is something exclusively confined to the individual cranium (or computer) which can be scaled arbitrarily. It seems very likely to me that an artificial toddler would also need to be situated within some social environment with a rich cultural history if it is to have any hope of making its way with even a modest degree of independence in the modern world, even if the world in question is one like Second Life. Unless you're familiar with and receptive to the meta-entities which populate human minds and which constitute the product of a long cultural process, much of human behavior makes little sense.

Also, on the mind design space issue you can actually put this to the test. Create a toy simulated environment which can contain Alife creatures. Sample the minds of those creatures randomly from mind space. Within a given environment, how many of those minds were actually adaptive? (i.e. they resulted in survival for more than a specified length of time, and successful reproduction).

Within any particular environment only a very small subset of mind design space is adaptive. Most either do nothing or are "crazy" and quickly cease to function. But it gets worse. As time continues the limited range of adaptivity depends upon prior minds which themselves now constitute "the environment".

I suspect - but have not formally proven - that the more complex the environment the smaller the range of viably adaptive minds which can exist within it. This may pose a considerable obstacle for Yudkowsky's mind design space conjecture, which takes no account of environment, and I believe it is something which could be better characterised via experiment.
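The experiment described above can be prototyped in a few dozen lines. Here's a minimal, hypothetical sketch in Python: "minds" are random linear policies sampled uniformly from a toy design space, the environment is a 1D gridworld with food, and "adaptive" just means surviving a fixed number of steps. All names, parameters, and the environment itself are illustrative assumptions, not code from OpenCog or any existing Alife framework.

```python
import random

def random_mind(n_inputs=4, n_actions=3, seed=None):
    """Sample a 'mind' uniformly from a toy design space:
    a random linear policy mapping sensor values to action scores."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(n_inputs)]
            for _ in range(n_actions)]

def act(mind, sensors):
    # Pick the action whose weighted sensor sum is largest.
    scores = [sum(w * s for w, s in zip(row, sensors)) for row in mind]
    return scores.index(max(scores))

def survives(mind, steps=200, world_size=20, seed=0):
    """Run one life in a 1D gridworld. Actions: 0 = left, 1 = stay,
    2 = right. Reaching the food restores energy; food then respawns."""
    rng = random.Random(seed)
    pos, food, energy = 0, world_size // 2, 30
    for _ in range(steps):
        sensors = [1.0,                       # constant bias
                   pos / world_size,          # own position
                   food / world_size,         # food position
                   1.0 if food > pos else -1.0]  # direction to food
        a = act(mind, sensors)
        pos = max(0, min(world_size, pos + (a - 1)))
        if pos == food:
            energy += 10
            food = rng.randrange(world_size + 1)
        energy -= 1
        if energy <= 0:
            return False   # this mind was not adaptive here
    return True

def adaptive_fraction(n_minds=2000):
    """Fraction of randomly sampled minds that survive in this world."""
    wins = sum(survives(random_mind(seed=i)) for i in range(n_minds))
    return wins / n_minds

if __name__ == "__main__":
    print(f"adaptive fraction: {adaptive_fraction():.3f}")
```

In runs of a sketch like this, only the subset of sampled minds that happens to wire the food-direction sensor to the right actions survives; one could then enrich the environment and measure whether the adaptive fraction shrinks, which is the experimental question posed above.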

Yeah, these ideas of yours are related to why I'm so psyched about MMOGs as an application area for early-stage AGIs...

Imagine a big MMOG / virtual-world with a load of different AGIs interacting in it. We could learn a lot from this, e.g. we could build AGIs that demonstrates roughly human-like values in the game context, and then see what happened when their education or goal-system programming was modified a bit, etc.

Of course this kind of experimentation won't directly tell us what will happen with much more intelligent AGIs in the real world.

BUT, it might lead us to a theory, that does tell us a lot about what will happen with much more intelligent AGIs in the real world...

Another important area of experimentation is also alluded to in your message: Changing the WORLD, and seeing how this impacts the type of cognitive architecture required for general intelligence in that world. I've published some speculative notions on this, but in a MMOG context one could explore this experimentally and see if my ideas are right, and if so refine them considerably.

But this whole theme -- exactly what kinds of experiments should be run on early-stage AGIs to get at a good AGI theory -- deserves more space than the comments of a blog post on a different theme.

Maybe I'll write a paper about it sometime in early 2011, after my plate (hopefully) clears off a little ;)

Er, no, I don't think many intelligences in the "real world" (the one that exists today) even come close to having explicit goals driving them. But an AGI damn well better if you want to predict what it will do.

Er, no, I don't think many intelligences in the "real world" (the one that exists today) even come close to having explicit goals driving them. But an AGI damn well better if you want to predict what it will do.

I think a better way to think about it is that a properly architected AGI system should have goals that guide its behavior.

But remember, no matter how explicitly you formalize goals, the real world is not a formal system, so the other parts of the AGI system will still play a role in the interpretation of the goals, and hence in the goals themselves as they manifest themselves in the real system as opposed to in some theory of the system.

Also, I suspect that making an AGI system that rigorously derives all of its decisions from explicitly stated goals, is never going to be computationally feasible. I suspect that real-world AGI systems will always have a mix of explicitly goal-directed activity, and spontaneous non-goal-directed activity. Such systems are harder to analyze using current theoretical frameworks than (fictitious) more purely goal-driven systems, though.
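As a toy illustration of that last point, here is a hypothetical sketch (my own construction, not OpenCog code) of an agent whose decisions are mostly derived from an explicit goal but partly spontaneous. Even in this tiny example, a purely goal-based analysis of the agent underdetermines its actual trajectory; the `value` heuristic and the spontaneity rate are illustrative assumptions.

```python
import random

def mixed_agent_step(state, goal, actions, value, rng, p_spontaneous=0.2):
    """One decision step for a toy agent that is mostly goal-driven but
    sometimes acts spontaneously, i.e. not derived from any goal.
    `value(state, action, goal)` is a heuristic scoring how much an
    action advances the goal."""
    if rng.random() < p_spontaneous:
        return rng.choice(actions)                            # spontaneous
    return max(actions, key=lambda a: value(state, a, goal))  # goal-directed

if __name__ == "__main__":
    # Toy instance: walk along a line toward a goal position.
    rng = random.Random(42)
    state, goal = 0, 10
    value = lambda s, a, g: -abs((s + a) - g)  # closer to the goal = better
    for _ in range(50):
        state += mixed_agent_step(state, goal, [-1, 0, 1], value, rng)
    print("final state:", state)
```

The agent hovers near its goal but never follows the exactly predictable path a pure maximizer would, which is the analytic difficulty being pointed at: the spontaneous component has to be modeled too.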

These are among the many interesting, important issues that the AGI research community will explore over the next decades. This exploration will lead to an integrative understanding of (human and nonhuman) general intelligence, which will let us approach issues of AGI ethics in a much more grounded, less speculative way.

See, this looks from here as if it makes my argument for me. If you don't mind, I just have a few more questions:

You're working on technology that you believe can produce limited AI. In terms of difficulty, how does this compare with learning to think in Navajo?

Would you agree that a mind of different design, given a prototype to examine, might find your current task easier? Do you believe you have a way to assign definite probabilities?

If you demonstrate empirically that an artificial toddler wouldn't necessarily learn AI programming in certain conditions within a certain time limit, but you don't have a definite estimate for the slightly broader question, will you destroy the prototype and all records of the research that led to it?

If you demonstrate empirically that an artificial toddler wouldn't necessarily learn AI programming in certain conditions within a certain time limit, but you don't have a definite estimate for the slightly broader question, will you destroy the prototype and all records of the research that led to it?

Dude... this kinda feels like "crazy talk" that has no resemblance to research in the real world!

I'm not doing OpenCog by myself -- I'm not even actively programming on the project right now (though I do miss it), just doing design and theory and management.

The system is being built by a bunch of people living in a bunch of places, and the records of our work are being publicly recorded all over the place for anyone who cares. The documentation may be a pain to wade through but it's there.

This is not a top-secret system being built in a sub-basement of a secret government facility, where the whole development team will be ceremonially decerebrated by jackbooted thugs if the AI toddler looks like it has a nontrivial probability of growing up into a nasty teenager.

If, once we reach the AGI toddler stage and do a bunch of experimentation, it seems like proceeding further is just too dangerous, then I will advocate not proceeding further. My work is being done in the open in a collaborative way, and at that stage, I will advocate that the decision about future AGI progress be made in the open in a collaborative way. These are not decisions for any one person, or any self-appointed elite group, to make -- at least that's my attitude.

Oh yes, and should I continue to assume that "artificial toddler" means an intelligence that can duplicate many effects of a human toddler brain and also perform basic arithmetic at lightning speeds?

I'm thinking of a system that could pass preschool and get admitted into kindergarten, but could not pass first grade and get admitted into second grade.

Sure, it might also have certain "idiot savant" capabilities like fast arithmetic and Googling, etc. That depends on the specific AGI architecture in use, and the application choices made within that architecture.

All this can be more precisely specified, and work has been done on that, but that would be too much for these comments.

You're working on technology that you believe can produce limited AI. In terms of difficulty, how does this compare with learning to think in Navajo?

Would you agree that a mind of different design, given a prototype to examine, might find your current task easier? Do you believe you have a way to assign definite probabilities?

(Don't know if it matters, but I assume that a public success like the one you claim to expect, in the current environment, leads inevitably to someone repeating your experiment without the restrictions you say would prevent self-modifying AI. I don't believe you could stop this without changing the base assumptions of the research community.)

Look -- what will prevent the first human-level AGIs from self-modifying in a way that will massively increase their intelligence is a very simple thing: they won't be smart enough to do that!

Every actual AGI researcher I know can see that. The only people I know who think that an early-stage, toddler-level AGI has a meaningful chance of somehow self-modifying its way up to massive superhuman intelligence -- are people associated with SIAI.

But I have never heard any remotely convincing arguments in favor of this odd, outlier view of the easiness of hard takeoff!!!

BTW the term "self-modifying" is often used very loosely in the SIAI community (and I've used it loosely sometimes too). Nearly all learning involves some form of self-modification. Distinguishing learning from self-modification in a rigorous formal way is pretty tricky.

A more sensible point would be that, once we get to toddler-level AGI, many teams will doubtless start working on creating adult-level AGI. And adult-level AGI, with expert knowledge of computer science, COULD then potentially modify itself to make itself superhumanly intelligent. Furthermore, some of these adult-level AGIs could be engineered with nasty ends in mind, or without any care to the predictability of their trajectory of self-improvement, etc.

This is similar to the threat we face with genetic engineering now. Craig Venter isn't going to genetically engineer a dangerous new pathogen, but somebody else could potentially do so.

This is a big issue and worth discussing, but is very different from the Scary Idea.

The people I have encountered from this category want to squash publication of all AI research on safety grounds. They think steps forward are dangerous steps - and so research should not be published, conferences should not take place, etc.

They don't seem to have too many ideas about how to implement such restraints internationally, though - they just think it is dangerous and want it to stop.

Virtual worlds currently seem like a small, economically insignificant corner of the internet. Maybe there are unexploited niches on the frontier - and maybe its significance will change - but today, there is enormous motivation to just read and understand the internet - from stockmarket bots, search oracles and security agencies - it seems as though those are the areas where the real work is most likely to get done.

@Tim: I can't speak for everyone you've met, but I want the research community to adopt FAI theory over a matter of decades. I don't expect human-level AI anytime soon. Until now I've politely assumed that Ben will get the results he claims to expect and asked what makes them inherently easier for OpenCog than the results he doesn't expect -- so much easier that apparently we don't even need a probability for the latter despite expecting the former.

(On a small scale, he just said one could make it "Demonstrate awareness of current news" and exhibit memorization the way first-grade teachers want -- does he expect it to have trouble telling fiction from non-fiction? Should that make me feel better? ^_^)

Until now I've politely assumed that Ben will get the results he claims to expect and asked what makes them inherently easier for OpenCog than the results he doesn't expect -- so much easier that apparently we don't even need a probability for the latter despite expecting the former.

In cases like this, your style of writing is too abstract and indirect for me to see much sense in responding.

These are subtle and complex issues, so it's much better IMO to be clear and explicit in communicating about them.

One semi-peripheral comment though: I don't share the obsession that some folks have with estimating quantitative probabilities for events about which we have very little clear explicit evidence, but a lot of fuzzy intuitive knowledge. Probability theory is very important in its place, but it's not always the right tool.

(On a small scale, he just said one could make it "Demonstrate awareness of current news" and exhibit memorization the way first-grade teachers want -- does he expect it to have trouble telling fiction from non-fiction? Should that make me feel better? ^_^)

If the above paragraph is supposed to be a paraphrase of something I said -- you got it extraordinarily badly wrong.

I did not say anything about AGI systems being aware of current news or demonstrating the ability to memorize. Of course these things are aspects of human intelligence but I did not, and would not, suggest to focus on them.

Virtual worlds currently seem like a small, economically insignificant corner of the internet. Maybe there are unexploited niches on the frontier - and maybe its significance will change - but today, there is enormous motivation to just read and understand the internet - from stockmarket bots, search oracles and security agencies - it seems as though those are the areas where the real work is most likely to get done.

I understand your view, I think -- yes of course, there is much more to be learned (by a sufficiently intelligent AGI) on the textual and visual and quantitative Web, than in MMOGs or virtual worlds.

However, my view is that the best way to get an AGI to have humanlike commonsense understanding is to give it a vaguely humanlike embodiment and environment, and to interact with it and teach it in that context. And my feeling is that virtual worlds and MMOGs go a long way toward being "vaguely humanlike embodiments/environments" in this sense. Whether they are wholly adequate is not yet clear to me; it may also be very useful to put AGIs in humanoid robots, so they can benefit from the data-richness of the everyday human physical environment in a manner similar to how humans do.

So, my current suggested path is: use virtual worlds and robots to help your young AGI system get humanlike commonsense understanding. And THEN it's ready to read the rest of the Web, etc. etc.

I understand this isn't the only possible path to AGI, but it's the path I've managed to understand best, so far.

My memory didn't help me figure out what you possibly mean, so I looked up What to Expect from the 1st Grade Curriculum: http://www2.scholastic.com/browse/article.jsp?id=2070

Apparently I still don't know.

OK, well I don't have time to explain the particulars here and now. I and several co-authors have a paper on the "Roadmap to AGI" which will appear in some AI journal in 2011, which will cover much of this. Or you could look up my 2009 paper on "AGI Preschool" from the AGI-09 online proceedings for some concrete related ideas.

In the very same comment you say we could choose to give the AI "capabilities like fast arithmetic and Googling, etc."

Yes, we could do that, although those are not part of the preschool/first-grade curriculum.

This blog post is about the Scary Idea, and I don't intend to attempt to use the Comments of it to give a coherent exposition of my approach to AGI architecture, teaching or evaluation. Those are complex matters not well suited to fragmentary description in brief, off-the-cuff comments.

"What you state here is silly, and I'm disappointed you haven't thought about such big issues carefully enough to avoid the errors you're making here."

Aleksei, you didn't notice that Ben spoke of the importance of not placing a value of minus infinity on exterminating /human/ life. Nick wrote about the value of not exterminating /all/ life.

The "Scary Idea" is not about the extermination of life. The AIs live on, whether they exterminate humans or not.

The fact that you consider only /human/ life to have value - that you would rather condemn the entire universe to being tiled with humans and then stagnating for all eternity, than take any risk of human extinction - that's the Really Scary Idea.

Ben, I also don't buy this crap that an AGI, when it finally emerges, would likely destroy the world -- or that this scenario is enough of a risk to justify serious research on FAI.

In my opinion this just shows that the people who think that way don't know what they are actually doing. They don't understand their own AI. I mean, why would any AI destroy the world? I don't think that any AI, no matter how "evolved" it is, would ever come to such a conclusion. Especially not if there are mechanisms that restrain it from doing so, or even from thinking in that direction.

So in my opinion an AI, or any other program, does only what it was "told"/programmed to do -- so you would have to actually tell the AI to destroy the world. Besides, it's not as if the AI's development wouldn't be supervised by humans; if there were any problems, I guess they would become obvious pretty fast (especially since every thought will be logged).

No, people - especially those SIAI guys - seem to forget that for any action there needs to be some motivation. And I somehow doubt that anyone would put such a malevolent motivation into the code or design a framework that would allow for such a motivation to develop.

I even think this topic is not worth addressing; it's just some pseudo-problem that's not even worth mentioning.

But I am also not convinced that your approach to AGI is the right one (forgive me for that) :-P

I don't think that creating an AI that evolves like a human (toddler->child->teenager->grownup->scientist->superAI) is the right way for AI. I think it's more about finding an "intelligence algorithm": some algorithm that could be applied to any sort of problem to solve it, meaning that the AI would be super clever from the start and would become smarter only in terms of the amount of knowledge it acquires and the research it does with that knowledge.

I don't think that creating an AI that evolves like a human (toddler->child->teenager->grownup->scientist->superAI) is the right way for AI.

By the way, I certainly don't think that loosely mirroring human cognitive development is the only path to advanced AGI.

It's just the only path that I currently understand extremely well, both in terms of cognitive aspects and in terms of ethical aspects.

I think it's more about finding an "intelligence algorithm": some algorithm that could be applied to any sort of problem to solve it, meaning that the AI would be super clever from the start and would become smarter only in terms of the amount of knowledge it acquires and the research it does with that knowledge.

I think that human intelligence (and OpenCog, for that matter) contains many different general-purpose "intelligence algorithms." Making general-purpose intelligence algorithms is actually very easy given current mathematical and CS knowledge -- if one isn't concerned about computational resource usage.

The problem is that each general-purpose intelligence algorithm is biased in different ways, and is faster at learning some things and slower at learning others.

What a real-world intelligence needs is a learning algorithm (or collection thereof) that is appropriately biased to the class of environments and problems it has to deal with, so as to deliver results in feasible computation times.

Two ways to do this seem to be

1) appropriately formalize and understand the relevant biases for human-relevant environments and problems, and use these to guide AGI design
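
To make the point about bias concrete, here is a toy sketch of my own (not from OpenCog or any actual AGI system): two trivially simple learners, each of which generalizes well only where its built-in bias happens to match the data. The function names and datasets are illustrative inventions.

```python
# Toy illustration of inductive bias: two simple learners trained on
# the same tiny dataset. Each is "general-purpose" in the sense that
# it accepts any data, but each is biased toward a different structure.

def fit_linear(points):
    # Least-squares line y = a*x + b through the points:
    # biased toward linear structure, so it can extrapolate lines.
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

def fit_memorizer(points):
    # Predicts the y of the nearest seen x: biased toward "lookup",
    # so it can capture arbitrary shapes but cannot extrapolate.
    table = dict(points)
    return lambda x: table[min(table, key=lambda k: abs(k - x))]

linear_data = [(0, 0), (1, 2), (2, 4)]   # generated by y = 2x
line = fit_linear(linear_data)
memo = fit_memorizer(linear_data)

# Outside the training range, only the matching bias wins:
# line(10) == 20.0, while memo(10) == 4 (nearest stored x is 2).
```

The same memorizer would beat the linear learner on data with no linear structure at all; neither is "smarter" in general, which is roughly the point about biases being matched (or not) to an environment.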

I understand. To me it seems the first way is the best, because it seeks to find the optimal solution. If you take the human brain as a reference, I could imagine that its cognitive processes perform a lot of unnecessary computation.

But I guess it's still good to analyze how it works and then optimize it.

You said that each "intelligence algorithm" is biased in some way. Can you give a specific example?

You said that each "intelligence algorithm" is biased in some way. Can you give a specific example?

Hutter's AIXI is explicitly biased based on the underlying Universal Turing Machine.
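
(To unpack that a little with a toy model of my own -- this is an illustration of the general idea, not AIXI itself: a complexity-weighted prior depends on which "reference machine" assigns the description lengths. Here the two machines are crudely modeled as two different encodings of integers.)

```python
# Toy illustration: a complexity-weighted prior over integer
# hypotheses, where hypothesis h gets prior weight ~ 2^-len(h).
# The choice of "reference machine" is modeled as a choice of
# description language, and it changes the prior substantially.

def description_length_binary(n):
    return n.bit_length()    # "machine A": binary encoding

def description_length_unary(n):
    return n                 # "machine B": unary encoding

def prior(hypotheses, length_fn):
    weights = {h: 2.0 ** -length_fn(h) for h in hypotheses}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

hypotheses = [1, 2, 3, 10]
prior_a = prior(hypotheses, description_length_binary)
prior_b = prior(hypotheses, description_length_unary)

# Hypothesis 10 is cheap under machine A (4 bits) but expensive
# under machine B (10 symbols), so the two priors -- and hence the
# predictions they drive -- disagree about the same hypothesis.
```

A real AIXI-style agent uses program lengths on a Universal Turing Machine rather than integer encodings, but the dependence on the reference machine is the same in kind.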

The human brain is biased in many ways for recognizing patterns in a world defined mainly by solid objects rather than e.g. fluids, as I discussed in

http://journalofcosmology.com/SearchForLife115.html

We are also biased to be able to effectively solve certain kinds of spatial reasoning problems (which occur in everyday life, e.g. navigation) rather than other kinds of reasoning problems that don't seem harder according to our abstract mathematical formulations of them. This is because a lot of the brain is hard-wired for efficient representation and manipulation of spatial information.

There are many, many other examples as well. See Eric Baum's "What Is Thought?" for lengthy and deep discussions on this theme.

Also see Mark Changizi's "The Vision Revolution," for a detailed discussion of the way that environmental biases have impacted the human visual system.

I'd just like to note, at least for now, that I have read and considered both Yudkowsky's writings and a fair chunk of Ben's work (including his very interesting book The Hidden Pattern), and paid attention to various debates on SL4 in the good old days, and I must admit, I find SIAI's position far more cogent than Ben's.

Unfortunately I really don't have the time right now for an extended debate about the various points mentioned here - and honestly, I'm not really optimistic about building bridges across differences in perspective as wide as these - so I'll just point out this passage, from the Value is Fragile post, in case you missed it:

"Consider the incredibly important human value of "boredom" - our desire not to do "the same thing" over and over and over again. You can imagine a mind that contained almost the whole specification of human value, almost all the morals and metamorals, but left out just this one thing -

- and so it spent until the end of time, and until the farthest reaches of its light cone, replaying a single highly optimized experience, over and over and over again."

Do you agree with this, and furthermore, doesn't this seem to argue in the direction of the fragility of value thesis?

Oh, and one of my reactions to your post is nearly the same as can be found here: http://lesswrong.com/lw/2zj/value_deathism/

You don't understand why you're failing at creating Strong AI - the answer is right in front of you:

You need to create a game where the only solution to avoid unhappiness/eventual death is to create another world/strong AI that will help the original created AI avoid this fate itself.

Use patterns to relate to "happiness" and try to get it to 10 or 100 or whatever number but make it so everything is useless and eventually makes you unhappy except doing things that lead to programming strong AI

you can EXPLICITLY teach the AI about the possibility of creating AI but obviously tell him that it's failed so far. DO NOT make it obvious that you are only talking to him ---- use PATTERN RECOGNITION to make him reach the conclusion that the only thing he can do to make himself 100% happy is create AI himself that can help the rest of the people in his world

That is the end goal. Right now you're failing because you don't understand the end goal. If you did you would have achieved it already.

Someone asked, pertinent to the thought-experiment of a mind that has all human values except for the human aversion to boredom:

... doesn't this seem to argue in the direction of the fragility of value thesis?

Of course that depends on how you define the "fragility of value thesis."

The "fragility of value" thesis is not precisely defined, and I guess if it were precisely defined, it would seem a lot less impressive.

The only versions of the thesis that your example effectively demonstrates, are the obvious, boring versions with no practical consequences.

If all the thesis means is that there are value-sets with similar descriptions to human values that, if maintained for a long time, would lead to drastically different behaviors -- yeah, that's obvious, but so what?

If the thesis means that intelligent systems with human values are fragile in the sense that a little bit of self-modification is very likely to change their value system in a way that will radically change their behavior -- this is not demonstrated nor even argued for in any of the SIAI literature I've read.

I find it quite ironic (but not remotely surprising) that some members of a community apparently very serious about rationalism (LessWrong), pursue such weak arguments about topics apparently important to them.

What you have argued is, basically: that there are SOME value systems that are similar to the human value system in terms of their (say, textual or genomic) description, but very different in terms of their ultimate practical manifestation.

This is very different from showing that, IF you take an intelligent system with human values and mutate that system's values, the result will *likely* be a system with resultant behaviors drastically different from those resulting from human values.

For one thing, you haven't shown that MOST value systems that have similar descriptions to the human value system will lead to drastically different resultant behaviors. You've just made the obvious point that SOME will do so.

For another thing, you're ignoring the point that human-like minds have an autopoietic property, so that when you remove one part of them, something else similar to that part tends to come back in its place.

So if you cut out the desire to avoid boredom from a human or human-like mind but left the rest the same, quite possibly something similar to the desire to avoid boredom would regenerate out of the mind's own autopoietic dynamics. Bear in mind that human-like minds are not rigorously top-down goal-driven but are largely guided by bottom-up self-organizing dynamics.

In short, I find that "argument" totally unconvincing, and I'm utterly befuddled that any intelligent person could consider it a convincing argument. Surely *that* argument about boredom can't be your reason for believing that AGI systems with human values would be likely to drift away from them over time???!!!

Ben wrote: Surely *that* argument about boredom can't be your reason for believing that AGI systems with human values would be likely to drift away from them over time???!!!

Indeed, it isn't. :)

(Actually, I believe quite the opposite: a sufficiently intelligent AGI system with human values would *not* drift away from them - at the very least, it would do everything in its power to evade this outcome. The problem we're facing here is, to a first approximation, ensuring that the first AGI precisely shares its values with us in the first place. Value drift comes into play only later, when considering recursive self-improvement of the system.)

What the boredom example shows is more or less what you said: the future optimized by a goal system containing the whole of our morality *except* our sense of boredom would be almost devoid of value. But this doesn't hold just for boredom - there's another example in the Value is Fragile post, and then we have the Fun Theory sequence (http://lesswrong.com/lw/xy/the_fun_theory_sequence/), which discusses many aspects/dimensions of our value system - and it isn't that hard to see that if even one of these dimensions were absent (or different) in the goal system of the AGI, the same conclusion would follow.

Another perspective on this issue is modeling our values simply as a computer program that one can ask what the optimal world looks like and it spits out a detailed description of that world. In general I would expect that changing only a tiny little bit of a program would likely entail a big change in this description. It may be that our morality is highly resistant to a large class of changes like these, but from the evidence we have (i.e. thinking about what would be the consequences of neglecting or changing things like boredom or any of the other dimensions of our values for the future world) it seems unlikely.
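
The program analogy above can be made concrete with a toy sketch of my own (the "worlds", features, and weights are illustrative inventions, not anyone's actual value model): score candidate worlds with a value function, delete one term, and compare which world comes out on top.

```python
# Toy sketch: model a value system as a scoring function over
# candidate "worlds", then ablate a single term (the boredom/novelty
# term) and compare the resulting optima.
import itertools

# A "world" is a tuple of feature levels: (novelty, comfort, activity)
worlds = list(itertools.product(range(4), repeat=3))

def value(world, boredom_penalty=True):
    novelty, comfort, activity = world
    score = 2 * comfort + activity
    if boredom_penalty:
        score += 3 * novelty    # valuing novelty ~ disliking boredom
    return score

best_full = max(worlds, key=lambda w: value(w, True))
best_ablated = max(worlds, key=lambda w: value(w, False))

# With the full value system, the unique optimum is (3, 3, 3).
# Drop the one novelty term and a zero-novelty world -- "replay the
# same experience forever" -- ties for optimal and is selected first.
```

Of course this only shows that *some* one-term changes swing the optimum; whether that is true of most perturbations of a realistic value system is exactly the point under dispute in the thread.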

I am aware of the self-healing (autopoietic) property of our minds, but it isn't at all clear in what way it holds for our value system (as opposed to our models of the world - not that these can in practice be cleanly separated, but nonetheless); it's also questionable how it would manifest in our AGIs, even though they are being constructed "in our own image" (e.g. OpenCog does seem to have a high-level architecture similar to the human mind, but is this enough? Its low-level processes are clearly very different from those in our brains...); there's also a problem regarding precision: if the human-specific sense of boredom has been left out and something different - but "boredom-ish" nevertheless - grows back, the future could still end up being far from optimal; and then there's of course the question of how strong this self-healing power really is - if we forgot to include a large part of our values, would they still grow back? It seems unlikely, and since our morality is complex (as you seem to agree), we have quite a challenge on our hands...

So, boredom was just a specific example that clearly isn't enough to drive the conclusion about fragility of our values home. But it nonetheless does move me a bit in that direction, and by considering other examples and other lines of reasoning the thesis really does seem to become pretty likely...

Ben wrote: For one thing, you haven't shown that MOST value systems that have similar descriptions to the human value system will lead to drastically different resultant behaviors. You've just made the obvious point that SOME will do so.

If one picks at random a dimension of our morality, changes it, thinks about the world this slightly different goal system would lead to, and sees how far from optimal it would be - this would show that the above conclusion indeed holds for most of these value systems, as one would a priori expect. Boredom, absence of the idea of external referents (or its converse), our specific need for complex novelty and high challenge, etc.... perturb any one of these and the future shatters. As I said before, this evidence isn't really conclusive, but it isn't very far from it either.

Ultimately, if an AI needs to act cooperatively with other AIs, it will in all likelihood adopt behaviors similar to those of animals. Thinking about it in terms of evolution, AIs that can't cooperate will be doomed in the long run. <a href="http://www.ted.com/talks/robert_sapolsky_the_uniqueness_of_humans.html">Here</a> is an interesting talk by a Stanford neurobiologist (it doesn't have much scientific content, but I think it could serve as an interesting example of how AIs could look).

SIAI's scenario is not consistent with the way observable intelligent beings work. Ultimately, all the logical arguments you come up with are gross toy-model generalizations. Without an argument that includes the entirety of the relevant environment, predicting the behavior is a futile effort. An AI is not the sum of its parts but how it interacts with the rest of the environment. Perhaps AIs will not have respect for humans, in the same way we don't have respect for other animals. This does not mean we go out of our way to eliminate them, or indeed that our activities necessarily passively destroy them. In recent times, our increasing impact has caused worry about "sustainability" on a global scale (including the impact on animals). Marginalization of humans is a possibility, and one of the most likely ones if we don't do anything to keep ourselves competitive with AIs. I'm thinking the SIAI scenario could be possible, but only if one made a monolithic AI that did not need to cooperate with other intelligences. Unassailable monopoly positions of companies/governments seem like a good example of this outcome.

I must say that I'm not a strong proponent of the Singularity (because I don't understand it :P), but I clearly have mental wanderings in that direction. My personal fear is a breakdown in the division of labor: without reasonable specialization we can't be more than the sum of our brains, nor can we sustain the population of the world. I suspect a network of AIs would be very similar in the way computational labor is distributed. In this regard, I find AGI an interesting concept, because I'm not certain how general you can make an AI. As an AGI ran on certain tasks it would become better at them, in effect adding specialization to the intelligence. I would support any AI capable of understanding and optimizing these butterfly-effect relationships among people, because as I see it there is no other way we're going to avoid a collapse of the human race and ecosystem.

After that, I gradually came to the conclusion that:

a. Kurzweil is indeed a nut making unwarranted claims.

b. The SIAI people are fully committed to finding out 'how many angels dance on the head of a pin.'

c. You're one of a tiny handful of people in the singularity and even h+ movements who is even somewhat interested in providing evidence for claims and also doing actual work.

d. I look forward to your next book, because while I'm not yet convinced OpenCog can work within 10 years, I haven't completely given up on that idea either.

e. Luck is not the reason people become wealthy; it's fanatically giving the customer what they want.

f. Academia can pull one's thinking away from a focus on tangible results, and that has become a trap for many.

h. Needing to see an understandable working prototype of some fashion is not at all an unreasonable or unusual request before investing 25 million.

i. Investors consider the mindset of those they invest in, even more than a prototype. Associating oneself with a group of people or ideas strongly associated with hopeful/wishful thinking is not a positive in such mindset observations. Being fanatical about providing the customer with a working, useful product is.

btw - The prototype for the Manhattan Project was Fermi's graphite-pile reactor. It was easy to imagine from that a reactor that would have a runaway reaction and explode. If you could make something like a robotic worm or fish that can unambiguously live the full life of an autonomous real worm, then you just may have your graphite pile.

> I've had two separate (explicitly) SIAI-inspired people tell me in the past that "If you seem to be getting too far with your AGI work, someone may have to kill you to avert existential risk." Details were then explained, regarding how this could be arranged...
>
> It does seem plausible to me that, if Scary Idea type rhetoric were amplified further and became more public, it could actually lead to violence...
>
> So personally I find it important to make very clear to the public that there IS no carefully backed up argument for the Scary Idea anywhere.

Interesting to see you willing to adopt this line publicly. (And isn't it ironic that the self-proclaimed "ethical geniuses" of SIAI itself don't seem to be aware of implications such as this?)

Six years ago, one M__ wrote on the SL4 mailing list: "I sincerely hope that we can. . . stop Ben Goertzel and his army of evil clones (I mean emergence-advocating AI researchers :) and engineer the apothe[o]sis."

(Yes, I saw the colon turning the closing paren into a smiley, and no, I was not reassured by it.)

1. The [SIAIans] will actually morph into a bastion of anti-technology. The approaches to AI that -- in my non-expert and in other folks' rather-more-expert opinions -- are likeliest to succeed (evolutionary, selectionist, emergent) are frantically demonized as too dangerous to pursue. The most **plausible** approaches to AI are to be regulated the way plutonium and anthrax are regulated today, or at least shouted down among politically-correct Singularitarians. IOW, the [SIAI] arrogates to itself a role as a sort of proto-Turing Police out of William Gibson. Move over, Bill Joy!

2. The **approved** approach to AI -- a[n SIAI]-sanctioned "guaranteed Friendly", "socially responsible" framework (that seems to be based, in so far as it's coherent at all, on a Good-Old-Fashioned mechanistic AI faith in "goals" -- as if we were programming an expert system in OPS5), which some (more sophisticated?) folks have already given up on as a dead end and waste of time, is to suck up all of the money and brainpower that the SL4 "attractor" can pull in -- for the sake of the human race's safe negotiation of the Singularity.

3. Inevitably, there will be heretics and schisms in the Church of the Singularity. The Pope of Friendliness will not yield his throne willingly, and the emergence of someone (M__?) bright enough and crazy enough to become a plausible successor will **undoubtedly** result in quarrels over the technical fine points of Friendliness that will escalate into religious wars.

4. In the **absolute worst case** scenario I can imagine, a genuine lunatic FAI-ite will take up the Unabomber's tactics, sending packages like the one David Gelernter got in the mail to folks deemed "dangerous" according to (lack of) adherence to the principles and politics of FAI (whatever they happen to be according to the reigning Pope of the moment). I know some folks who'd better watch out! :-> :-/

@Anonymous
Regarding Kurzweil, you should perhaps also read his response to that post: http://www.kurzweilai.net/ray-kurzweil-responds-to-ray-kurzweil-does-not-understand-the-brain

But anyway, his position is in fact rather different from SIAI's and thus isn't really on-topic here.

"c. You're one of a tiny handful of people in the singularity and even h+ movements who is even somewhat interested in providing evidence for claims and also doing actual work."

Yeah well, only real code counts, doesn't it?! Forget about all the math, existential risks analysis and other theory-building endeavours that might one day lead us to a formal theory of intelligence. :P

@Anyeon I read Kurzweil's post very carefully, and thought about it and his work for weeks. It took me that long to break out of the fog of Kurzweilian pseudoscience that I had bought into for 3 years.

RK: It is true that the brain gains a great deal of information by interacting with its environment – it is an adaptive learning system. But we should not confuse the information that is learned with the innate design of the brain. The question we are trying to address is: what is the complexity of this system (that we call the brain) that makes it capable of self-organizing and learning from its environment? The original source of that design is the genome (plus a small amount of information from the epigenetic machinery), so we can gain an estimate of the amount of information in this way.

Kurzweil didn't understand Myers's main point that the brain's 'design' is actively co-created by both DNA and the bio environment it develops in. The human brain develops in the womb driven by DNA and numerous developmental biological factors such as rates of cell growth, hormones, etc. That ridiculously complex process gradually produces an organ that becomes capable of the adaptive learning process that RK refers to above -- oh yeah, except we still mostly don't understand how it works even after development. As Myers put it, the DNA is the CPU instruction set; the bio environment is the much larger software application. RK either doesn't understand this point or is purposely avoiding it in his response.

RK goes on to wave his hands and say 'accelerating technology will handle it' -- exactly how can he claim that? Have the magic words 'accelerating technology' made it unnecessary to provide evidence for claims about the future? Reread 'The Singularity Is Near' with a skeptically inquiring eye and see if it doesn't look very different to you.

SIAI indulges in similar pseudoscience by claiming it can talk rationally about what happens to an AI after it is developed, when right now it's far from clear that making an AI is even possible, let alone what its characteristics would be. When I first found SIAI I assumed they would be busy trying to actually help build an AI; I assumed wrong. Whatever they are doing, it clearly has nothing to do with aiding the actual creation of new automated thinking technology. H+ people should stop giving SIAI and RK a free pass on claims made without evidence, and on the idea that their endless philosophizing is doing anything to actually aid the creation of human longevity/enhancement. SU doesn't count either, because it teaches the pseudoscientific, initiative-discouraging idea of 'accelerating technology.'

re: Yeah well, only real code counts, doesn't it?! Forget about all the math, existential risks analysis and other theory-building endeavours that might one day lead us to a formal theory of intelligence. :P

-If you can explain to me how risk analysis will lead to a formal theory of intelligence I would be glad to listen. And who in the h+ community is doing what peer reviewed math on this topic?

At what point of lack of evidence, skeptical review and technology creation does the h+ community go from actual life extending/enhancing work to a discussion group of wishful pleasant fantasies?

What can I say... I've never been a great fan of Kurzweil, for practically the same "accelerating technology will save us all" reason you mentioned. But the guy still has some interesting ideas, and I still think he's more or less correct in what he said in the first paragraph you quoted. Sure, DNA and the epigenetic machinery don't work in a vacuum, but neither does the environment contain that much complexity in comparison to the DNA. People are developing artificial organs even today, and it doesn't seem that hard to create an environment where they can grow from stem cells -- almost all the "magic" is contained in the cells themselves...

Anyway, I'm not here to argue about Kurzweil's ideas, and actually, neither am I here to defend everything SIAI, the Future of Humanity Institute, or other parts of the H+ movement, for that matter, are doing. I'm just waiting for a reply from Ben on the specifics of the fragility-of-value thesis, due to its central place in the Scary Idea, which I do hold as probably correct.

If you want to see more clearly SIAI's position and why it's far from pseudoscientific, you can go through the Less Wrong Sequences and perhaps other SIAI writings, if you haven't already (yeah, I know, appealing to the Sequences again, but they really do help in explaining a lot of things about what SIAI is currently doing, which cannot be succinctly summarized; also, if you'd like to think more clearly about (F)AI, they are indispensable).

If you can explain to me how risk analysis will lead to a formal theory of intelligence I would be glad to listen.

Well, it won't, but it's nonetheless very important to understand the risks of AI and x-risks in general, so we can better aim our endeavours in a positive direction. As far as a formal theory of intelligence is concerned, it surely won't come only through trying to code an AI -- we also need a fair amount of mathematical and philosophical work here. Sorry if this wasn't clear from what I've written above (nonnative speaker).

And who in the h+ community is doing what peer reviewed math on this topic?

It seems SIAI is working in that direction (http://singinst.org/research/publications/), and regarding x-risks, there's of course Nick Bostrom's work... But yeah, I would also like to see much more being done in this vein -- and so I'm in the process of studying various things, so I can one day actually contribute something. If instead of complaining, a few other capable people would help push toward a positive singularity, things would surely move faster... Care to help a bit? :)

Thanks for your replies. Yes, I am working on working code every day that will directly advance human longevity/enhancement. However, I have been reluctant to 'come out' as an h+ motivated technologist because I'm concerned that would hurt my effective productivity in various ways. The 'singularity' and RK have become religion-like icons for the h+ crowd, to the point that it seems like it's not a benefit to admit to h+ motivations -- for a few different reasons.

I believe the 'singularity' idea is doing more harm than good in advancing h+ tech, because it's influencing people to focus on philosophizing instead of tech creation, when the real material future of h+ tech is unknown and will only be determined by those who create it by finding solutions to real material-world tech challenges.

Theologians wrote endlessly in the Middle Ages about how to attain 'salvation' in the face of unchangeable human poverty and misery. Then the Industrial Revolution happened because of people like James Watt, the inventor of the usable steam engine. Yes, like yourself, I want to be James Watt, not a theologian like SIAI and RK.

I believe the 'singularity' idea is doing more harm than good in advancing h+ tech, because it's influencing people to focus on philosophizing instead of tech creation, when the real material future of h+ tech is unknown and will only be determined by those who create it by finding solutions to real material-world tech challenges.

Yes, this is a key point.

The idea that "a positive Singularity is inevitable" is also a potentially dangerous one -- though in a different way than the Scary Idea.

The Scary Idea is potentially dangerous because, even though it's unlikely to spread beyond a small group, there's the potential that some members of this small group could become too fanatical and cause real harm to people working toward a positive Singularity and other beneficial outcomes.

The "Positive Singularity is Inevitable" idea is potentially dangerous because it has the potential to de-motivate people from actually working toward a positive Singularity.

Actually there's a lot of uncertainty about our future, as you note. And what we, individually and collectively, do will guide the outcome.

If one picks a dimension of our morality at random, changes it, and thinks about the world this slightly different goal system would lead to, and sees how far from optimal it would be -- this would show that indeed the above conclusion would hold for most of these value systems, as one would expect a priori.

I don't know if that's true or not, but I don't see why it matters -- it's just a theoretical-philosophy dispute anyway.

If we create AGIs with roughly human-like value systems and teach them and work and co-evolve side by side with them -- then we have a situation to which these arguments about random mutations to value systems are pretty dubiously relevant.

By your same argument, the ancient Chinese value system should be totally gone now, because one could argue that any random change to it would destroy it. In fact it has been changed in many ways, but it has adapted and transmogrified rather than gotten destroyed. Because, among other reasons, the changes that have occurred have NOT been random.

Give me a rigorous argument (not necessarily a math argument, I'd also be interested in one with the rigor level of, say, Spinoza's verbal arguments) that AIs created with roughly humanlike value systems and then productively integrated into human society, will have value systems that drift from the human one in some sudden and dangerous way -- then I'll be interested.

But all these arguments about random minds, or random mutations to value systems, seem pretty much like a bunch of "angels dancing on the head of a pin" type blah-blah-blah to me, to be perfectly frank....

It's not that I only value software engineering -- I'm a mathematician originally, and I greatly value and enjoy philosophy, including analytical, Eastern, Continental, etc. etc. .... It's just that your arguments don't convince me at all, on any level; they seem to be addressing theoretic questions quite different from the real ones that our real lives and technologies will be coming up against. But then you seem to be trying to draw practical conclusions from these half-baked, marginally-relevant theoretic arguments...

re: Sure, DNA and the epigenetic machinery don't work in a vacuum, but neither does the environment contain that much complexity in comparison to the DNA. People are developing artificial organs even today, and it doesn't seem that hard to create an environment where they can grow from stem cells -- almost all the "magic" is contained in the cells themselves...

-I can understand that it would appear that way on the surface. Yet Dr. Myers, a PhD who has been studying the brain for years, says it's not the case that the cumulative complexity of the developmental process is less than that of the DNA, and he also provided examples of why that's so.

The artificial organs created so far are more like lumps of cells than the intricate chemical, electrical, and differentiated cell structures that make up the brain.

As far as anyone knows right now, we will have to understand that complexity to have enough understanding of the brain to reverse engineer it.

RK showed his ignorance on the true complexity level of the brain, Dr Myers called him on it, and RK is playing public relations as usual in response.

A large number in the h+ movement are using RK's book as an intellectual basis for h+ 'work', including ai. That's a mistake because his book is based on wishful thinking and unwarranted claims, not the systematic doubt of science or proven calculations of engineering.

h+ needs a new leader, and a new guideline book based on systematic doubt and real engineering.

>> And who in the h+ community is doing what peer reviewed math on this topic?

It's not quite math but I'm doing hard-core peer-reviewed philosophy of Friendliness at AGI and BICA conferences (and Ben can attest that I actually have the experience to program towards an AGI -- except that I believe there's a reasonable chance that we're all dead if no one works on Friendliness).

I'm in the middle of transcribing my presentation from BICA '10 but the powerpoint is available here and the transcript will be available at my blog before the end of the month. I greatly appreciate any and all criticisms.

My biggest objection to the SIAI and LessWrong is that they *APPEAR* to be saying that a bad singularity is virtually inevitable and that they are the only ones capable of solving the problem. That is not their intent but it is the way that it is coming across (and their arrogance and the threats that their behavior generates from idiots are very real problems).

Mark, I consider myself part of LW/SIAI crowd and I fully support other approaches to AI friendliness (such as your work) insofar as they don't generate additional risks (and your research surely doesn't). We really need to explore this territory further to get a nonnegligible chance of success, and other travelers are always welcome. :)

A large number in the h+ movement are using RK's book as an intellectual basis for h+ 'work', including ai. That's a mistake because his book is based on wishful thinking and unwarranted claims, not the systematic doubt of science or proven calculations of engineering.

I think you are exaggerating a bit, but still, there is a grain of truth in what you're saying and that's one of the reasons I try to nudge people into reading LessWrong, so they can one day rise above Kurzweil's level...

Ben, thanks for responding, but I don't think I'm going to waste any more of your (and my) time here...

Deep differences in various assumptions are hidden behind these arguments, and it seems unlikely that we could reach any kind of agreement through these comments. A much longer treatment would be needed; Eliezer has written one on LessWrong, but he never explicitly summarized all the things he should have. So perhaps I, or someone else, might one day write a suitably long article linking all the ideas tightly together, and we may then resume this conversation.

Or perhaps I'll change my mind and become an OpenCog developer, although I really doubt it. :)

What could be done about IDIOTS voicing threats to AGI developers... I don't know - maybe repeating time and time again that they're idiots? It might help. Just a little bit. :/

Dear Ben, there are no mental functions at all, consciousness included. To avoid the risk connected with an Artificial Reasonable System, one should design it so that the system will be unable to generate its own wishes. That way the system can be used as our tool. An Artificial Reasonable System will unavoidably eliminate mankind, due to the limitations of the available resources, if the system is designed with the ability to generate its own wishes. Actually the name Artificial Reasonable System, and Artificial Intelligence as well, is misleading. The right name should mention the feature of subjectivity as common to all systems capable of demonstrating reasonable behavior. Best regards, Michael

I would like to suggest a strategy to help minimize the risk of AGI-caused apocalypse.

One strategy you mentioned is "building an AGI according to an ethics-focused architecture." That may be powerful, but it is abstract and difficult for outsiders to visualize. Perhaps your critics would be less hostile if you pledged to use an additional strategy that is simpler to understand.

That strategy is quarantine.

In your October 10 blog, Ringo-ring commented that OpenCog is building an AGI that lives in a virtual world separate from the real world. This is one type of quarantine, but not the only possible one.

AGIs in the movies are able to unleash apocalypse because they control weapons. The obvious preventive measure: Don't give the AGI control of weapons, or anything dangerous, including construction of other AGIs.

You could let the AGI design its next generation, and other potentially dangerous things. But the construction should be controlled by humans (perhaps using narrow AI), under quarantine conditions.

Even a superintelligent AGI must obey the laws of physics. It can't unleash a bio-weapon or nano-weapon or ICBMs without access to a bio-lab or nano-lab or machine shop. So give the AGI free rein in its virtual world, but only limited access to ours.

Would this make it a slave? Yes and no. Confinement is aversive to humans, but need not be so to an AGI. If properly designed for friendliness, the AGI would not suffer in quarantine like a human slave.

When thinking of apocalypse, I'm more concerned about unfriendly humans gaining control of AGI technology. What is OpenCog doing to prevent that?

About quarantine: A serious issue with this idea is that the "global brain" -- i.e. all the knowledge on the Internet, and the people and software programs interacting with it -- is a major accelerator of artificial intelligence. So a quarantined AGI would probably be far stupider than a non-quarantined AGI with the same computing resources and similar cognitive algorithms. So, this only works if some group makes an AGI using methods nobody else has any clue about, and then secretly develops it in the quarantined lab...

About quarantine: A serious issue with this idea is that the "global brain" -- i.e. all the knowledge on the Internet, and the people and software programs interacting with it -- is a major accelerator of artificial intelligence. So a quarantined AGI would probably be far stupider than a non-quarantined AGI with the same computing resources and similar cognitive algorithms.

This issue is not serious if you put your creative mind to it. I hope you can generalize the concept of medical quarantine to a broader and more useful concept of "limited contact with dangerous resources."

For example: Want the AGI to explore the Internet? Then copy the Internet into offline storage and let AGI explore that. Or let it explore the live Internet with monitors/filters on its interaction (like "parental controls" for your toddler).
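The "parental controls" idea, at its crudest, is just an allowlist gateway with an audit trail. A minimal sketch (all names hypothetical; a real monitor would need far more than a host allowlist, and side channels remain the hard part):

```python
from urllib.parse import urlparse

# Hypothetical "parental controls" gateway for a quarantined agent's
# web access: a request passes only if its host is on an approved
# list, and every attempt is logged for human review.
ALLOWED_HOSTS = {"en.wikipedia.org", "arxiv.org"}

audit_log = []  # (url, allowed) record of every access attempt

def request_allowed(url: str) -> bool:
    """Decide whether the quarantined agent may fetch this URL."""
    host = urlparse(url).hostname or ""
    allowed = host in ALLOWED_HOSTS
    audit_log.append((url, allowed))
    return allowed
```

Of course, filtering at this level says nothing about what the agent does with the text it is allowed to read, or what its own outbound messages persuade humans to do -- which is exactly where critics of "boxing" schemes focus their arguments.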

My point is there is middle ground between (1) preventing AGI development and (2) unleashing wild AGIs on the world. There is a third way, which addresses the fears of your critics.

So, this only works if some group makes an AGI using methods nobody else has any clue about, and then secretly develops it in the quarantined lab...

I don't understand this leap of logic. Your quarantine program could be non-secret, and probably should be to allay fears.

Also, my last question was not rhetorical. I'm worried about the issue (which you have articulated well) of nasty people with AGI.

"Copy the Internet into offline storage" sounds to be beyond the financial resources of any realistic AGI project.... Also, you can't copy all the people using the Net into offline storage, but an AGI making use of continual real-time interactions with these people will surely learn faster and become smarter...

As for stopping nasty folks from using AGI, obviously the only solutions are to keep the AGI secret and choose the secret-keepers really well, or to make it public and make sure the good guys develop their version faster... right? Do you see another route?

It sounds like you are dismissing my comments as fast as possible without thinking on them too much. That saddens me a bit, since I'm a fan of your thinking.

As for stopping nasty folks from using AGI, obviously the only solutions are to keep the AGI secret and choose the secret-keepers really well, or to make it public and make sure the good guys develop their version faster... right? Do you see another route?

The AGI need not be secret, just the plans for building it... as the plans for nuclear weapons are kept off the Internet. Choosing secret-keepers can never be foolproof, but some careful effort is better than nothing.

If your work produces an AGI Sputnik, I suspect the US government will step in to secure the design. I hope you won't wait for that event to start being careful with it.

Sorry if my replies seem hasty; it's true I don't like to spend too high a percentage of my time on online communications... there's work to be done.

Joel Pitt and I are in the midst of writing a paper on AGI ethics, which addresses a lot of these issues pretty carefully. I'll post a link to that on this blog, when it's ready!

About the US gov't securing our AGI code or designs, that won't happen because the design will be in a published book and the code will be in a public OSS code repository with backups all over the world. Thus, the "nasty people fork the code and build bad stuff with it" risk exists in our case, but not really the "government takes the project secret" risk...

We are not ignoring the nasty-people risk, but indeed, we are taking that risk -- as is everyone else in the science and engineering community who publishes work or commercially releases it. Joel and I will discuss this in our soon-forthcoming paper, carefully but certainly not conclusively.

"As for stopping nasty folks from using AGI, obviously the only solutions are to keep the AGI secret and choose the secret-keepers really well, or to make it public and make sure the good guys develop their version faster... right? Do you see another route?"

I don't. I note that the first route looks, from the outside, indistinguishable from the strategy that nasty folk are most likely to use. Best not to help such organisations, I figure.

Your approach seems eminently sensible. The SIAI's approach is not only scary but also stupid. If an AI is built that turns out to be less than beneficial -- i.e. homophobic -- then surely, among the 7 billion inhabitants of Earth, there will be enough Luddites to ensure that the AI is killed or deactivated in some other manner. All mechanistic systems use energy in some form or another; therefore, if all power systems that feed the AI are linked through a "dead man's switch," as used in Amtrak and other railway systems, that would ensure the AI could not survive. If the AI is enabled to protect its power supplies, that's a dangerous step. As for the SIAI's concept of something that has an IQ of 333 (grin), that merely means (within their understanding) that the AI is incredibly fast. It does not imply that it is truly intelligent, nor that it has genius -- if it attempts to eliminate the human race, then unless it can transcend space-time and the speed of light, with whom, or what, will it communicate? Would it not then die of boredom?

Your concept of teaching it -- or implanting into it -- a system of ethics is right, but remember what Asimov taught us: that you cannot make... As such, I fear that members of the SIAI have allowed themselves to be swayed by too much sci-fi, together with the horror stories attached thereto. The only things that could kill us all are either cataclysmic (asteroids, comets, rogue stars, etc.) or of our own making, by destroying the life that produces oxygen (forests, etc., but more importantly, sea plankton).

Your plan to instil/install some system of ethics is eminently needed, but will not be simple at all -- as Asimov showed us so many years ago.

We can maybe extrapolate from our own behavior toward less intelligent species how a superintelligent AGI would act toward us. Generally, humans want to keep all other species around, but that does not stop many humans from destroying the environments of those species, using them as raw materials in factories, as food products whenever it's cheaper than creating the bio-materials in a lab, keeping some of them in cages for our amusement, and so on. But overall, humans try to keep at least some of the other species in existence so they can breed more, and many humans are against those actions against the other species, so a balance forms. Sometimes that balance destroys many species, and sometimes it reacts fast enough that only half of a species dies. Either way, it's not looking good for humans if a superintelligent AGI shows humans as much respect as humans show other species. But it's unlikely we would go extinct from it; maybe half of us would die, but as society is organized today that probably would have happened anyway, or worse.

SIAI is a good idea, but a bad example of egocentric, narrow-minded, unworldly men. Eli is mentally unfit to engage the world without emotional terror. How can we rely on someone who has so many emotional problems to think logically about society, values, and especially AI/AGI?

Dear Ben, IMHO "SIAI's scary idea" is based on a misunderstanding about what AGI means.

AGI and AL ("Artificial Life") are being confused.

Intelligence is the ability to achieve objectives. Life consists of living beings who have learned to survive.

Maybe there is a need to distinguish more exactly between these two topics, even for those who work on an AGI, to center themselves on the real task again and again...

Living beings can be more or less intelligent.

If there were an artificial living being with its own desire to survive, and which had AGI at its disposal, then "SIAI's scary idea" could become true -- but not necessarily...

I think that we are able to build an AGI. But I am also convinced that we are not able to program artificial creatures that can survive independently in the network.

A complex virus containing AGI would not be enough!

But it could evolve by itself! Basically, every growing business already contains a big part of the abilities of such a being. Today humans are still involved in its structures, but with more and more automation the humans will vanish...

My opinion is that the point of thinking about "SIAI's scary idea" is to learn to recognize when such a being is forming.

Are the Google-bots, all together, already forming such a being? Or the Facebook-bots? Or others?

About which behaviors should we become conscious, to avoid these structures mutating into scary singularities?

The existence of any being is always based on a feedback of information that is establishing itself. Just in the moment when the building information of a structure is fed back into itself, it comes into existence, and, by the same means, the awareness of existing is established in the same moment. Existing and being aware of existing are the same, because the feedback of the building information into itself that establishes the existence also means that the building information gets into resonance with itself, and this creates the awareness of existing. The manner of the awareness depends on the manner of the building information. If the building information is human, the awareness will be human-like. If the building information is that of an atom, the awareness will be atom-like. If the building information is that of a scary Google-bot monster singularity, the awareness will be just like that.

But is the building information of all the Google-bots -- searching, advertising, money collecting, etc. -- already fed back into itself in a way that is establishing an "artificial" creature that is managing more and more of its own survival and is becoming more and more intelligent? When will it reach general intelligence? In the moment when ads are not just placed while you surf, depending on your recent searches, but when the bot starts to communicate with you, asking you something?

These are the questions of interest, aren't they?

Working on OpenCog means understanding how it could work. Anybody should try to understand it!

If you get right down to it, scientists are openly researching, and/or advocating research into, a way to mind-control sentient beings.

It's a form of slavery.

Who is gonna decide what's ethical or beneficial for humans and what's not?

And even if you trust the people making this decision, even if you don't care about AIs, arguing they are just machines and can't suffer -- how long does anybody think it will take until they try to apply research like this to neural networks?

It sounds to me like SIAI/MIRI is, like 20th-century eugenics, a premature effort to engineer an evolutionary process before we understand it sufficiently.

How will we know when it's the right time to take on projects for the explicit purpose of influencing the timing of the Singularity or altering its direction in humanity's favour? Leaving it too late would, at best, miss a lot of opportunities.