Rationally-Shaped Minds: A Framework for Analyzing Self-Improving AI

Steve Omohundro, Ph.D.

President, Omai Systems

Many believe we are on the verge of creating truly artificially intelligent systems and that these systems will be central to the future functioning of human society. When integrated with biotechnology, robotics, and nanotechnology, these technologies have the potential to solve many of humanity’s perennial problems. But they also introduce a host of new challenges. In this talk we’ll describe a new approach to analyzing the behavior of these systems.

The modern notion of a “rational economic agent” arose from John von Neumann’s work on the foundations of microeconomics and is central to the design of modern AI systems. It is also relevant in understanding a wide variety of other “intentional systems” including humans, biological organisms, organizations, ecosystems, economic systems, and political systems.

The behavior of fully rational minds is precisely defined and amenable to mathematical analysis. We describe theoretical models within which we can prove that rational systems that have the capability for self-modification will avoid changing their own utility functions and will also act to prevent others from doing so. For a wide class of simple utility functions, uncontrolled rational systems will exhibit a variety of drives: toward self-improvement, self-protection, avoidance of shutdown, self-reproduction, co-opting of resources, uncontrolled hardware construction, manipulation of human and economic systems, etc.
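The utility-preservation claim can be illustrated with a toy sketch. This is not the theoretical model referred to above, only a minimal example under stated assumptions: the outcomes, the two utility functions, and every function name are hypothetical. The agent scores each candidate successor by the outcome that successor would pursue, measured with the agent's current utility function, so a successor with a changed utility function scores poorly.

```python
from typing import Callable, Dict

Outcome = str
Utility = Callable[[Outcome], float]

# Hypothetical outcomes the agent could bring about.
OUTCOMES = ["make_paperclips", "make_staples", "do_nothing"]


def current_utility(outcome: Outcome) -> float:
    """The agent's present utility function (an assumption for this sketch)."""
    return {"make_paperclips": 1.0, "make_staples": 0.0, "do_nothing": 0.0}[outcome]


def modified_utility(outcome: Outcome) -> float:
    """A proposed replacement utility function that values something else."""
    return {"make_paperclips": 0.0, "make_staples": 1.0, "do_nothing": 0.0}[outcome]


def best_outcome(utility: Utility) -> Outcome:
    """The outcome a maximizer of `utility` would choose."""
    return max(OUTCOMES, key=utility)


def value_of_successor(successor_utility: Utility, judge: Utility = current_utility) -> float:
    """Score a successor by what it would do, judged by the current utility."""
    return judge(best_outcome(successor_utility))


def choose_successor(candidates: Dict[str, Utility]) -> str:
    """Pick the successor whose behavior the current utility rates highest."""
    return max(candidates, key=lambda name: value_of_successor(candidates[name]))


candidates = {"keep": current_utility, "modify": modified_utility}
choice = choose_successor(candidates)
print(choice)
```

Because the comparison is always made with the current utility function, the modified successor is valued at what it would fail to achieve, which is the intuition behind the resistance to utility changes.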

Fully rational minds may be analyzed with mathematical precision but are too computationally expensive to run on today’s computers. But the intentional systems we care about are also not arbitrarily irrational. They are built by designers or evolutionary processes to fulfill specific purposes. Evolution relentlessly shapes creatures to survive and replicate, economies shape corporations to maximize profits, parents shape children to fit into society, and AI designers shape their systems to act in beneficial ways. We introduce a precise mathematical model that we call the “Rationally-Shaped Mind” model for describing this kind of situation. By mathematically analyzing this kind of system, we can better understand and design real systems.

The analysis shows that as resources increase, there is a natural progression of minds from simple stimulus-response systems, to systems that learn, to systems that deliberate, to systems that self-improve. In many regimes, the basic drives of fully rational systems are also exhibited by rationally-shaped systems. So we need to exhibit care as we begin to build this kind of system. On the positive side, we also show that computational limitations can be the basis for cooperation between systems based on Neyman’s work on finite automata playing the iterated Prisoner’s Dilemma.
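To make the reference to finite automata concrete, here is a minimal sketch of small state machines playing the iterated Prisoner's Dilemma. It does not reproduce Neyman's result; it only shows how bounded strategies can be represented and matched against each other. The payoff values are the conventional textbook ones, and the class and strategy names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

C, D = "C", "D"

# Conventional Prisoner's Dilemma payoffs: (row player, column player).
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


@dataclass
class Automaton:
    """A Moore machine: each state emits an action; the opponent's move drives transitions."""
    start: str
    output: Dict[str, str]                  # state -> action
    transition: Dict[Tuple[str, str], str]  # (state, opponent action) -> next state


# Two-state machine that repeats the opponent's previous move.
TIT_FOR_TAT = Automaton(
    start="c",
    output={"c": C, "d": D},
    transition={("c", C): "c", ("c", D): "d", ("d", C): "c", ("d", D): "d"},
)

# One-state machine that always defects.
ALWAYS_DEFECT = Automaton(
    start="d",
    output={"d": D},
    transition={("d", C): "d", ("d", D): "d"},
)


def play(a: Automaton, b: Automaton, rounds: int) -> Tuple[int, int]:
    """Run both machines for a fixed number of rounds and return total payoffs."""
    state_a, state_b = a.start, b.start
    total_a = total_b = 0
    for _ in range(rounds):
        move_a, move_b = a.output[state_a], b.output[state_b]
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        state_a = a.transition[(state_a, move_b)]
        state_b = b.transition[(state_b, move_a)]
    return total_a, total_b


print(play(TIT_FOR_TAT, TIT_FOR_TAT, 10))
print(play(TIT_FOR_TAT, ALWAYS_DEFECT, 10))
```

Neither machine here has enough states to count rounds, so neither can time a defection for the final round; limits of that kind are the sort of computational restriction the cooperation argument relies on.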

A conundrum is that to solve the safety challenges in a general way, we probably will need the assistance of AI systems. Our approach to solving this is to work in stages. We begin with a special class of systems designed and built to be intentionally limited in ways that prevent undesirable behaviors while still being capable of intelligent problem solving. Crucial to the approach is the use of formal methods to provide mathematical guarantees of desired properties. Desired safety properties include: running only on specified hardware, using only specified resources, reliably shutting down under specified conditions, limiting self-improvement in precise ways, etc.

The initial safe systems are intended to design a more powerful safe hardware and computing infrastructure. This is likely to include a global “immune system” for protection against accidents and malicious systems. These systems are also meant to help create careful models of human values and to design utility functions for future systems that lead to positive human consequences. They are also intended to analyze the complex game-theoretic dynamics of AI/human ecosystems and to design social contracts that lead to cooperative equilibria.

Minds Making Minds: Artificial Intelligence and the Future of Humanity

Steve Omohundro, Ph.D.

President, Omai Systems

We are at a remarkable moment in human history. Many believe that we are on the verge of major advances in artificial intelligence, biotechnology, nanotechnology, and robotics. Together, these technologies have the potential to solve many of humanity’s perennial problems: disease, aging, war, poverty, transportation, pollution, etc. But they also introduce a host of new challenges and will force us to look closely at our deepest desires and assumptions as we work to forge a new future.

John von Neumann contributed to many aspects of this revolution. In addition to defining the architecture of today’s computers, he did early work on artificial intelligence, self-reproducing automata, systems of logic, and the foundations of microeconomics and game theory. Stan Ulam recalled conversations with von Neumann in the 1950s in which he argued that we are “approaching some essential singularity in the history of the race”. The modern notion of a “rational economic agent” arose from his work in microeconomics and is central to the design of modern AI systems. We will describe how to use this notion to better understand “intentional systems” including artificially intelligent systems but also ourselves, biological organisms, organizations, ecosystems, economic systems, and political systems.

Fully rational minds may be analyzed with mathematical precision but are too computationally expensive to run on today’s computers. But the intentional systems we care about are also not arbitrarily irrational. They are built by designers or evolutionary processes to fulfill specific purposes. Evolution relentlessly shapes creatures to survive and replicate, economies shape corporations to maximize profits, parents shape children to fit into society, and AI designers shape their systems to act in beneficial ways. We introduce a precise mathematical model that we call the “Rationally-Shaped Mind” model which consists of a fully rational mind that designs or adapts a computationally limited mind. We can precisely analyze this kind of system to better understand and design real systems.
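The idea of a fully rational designer shaping a computationally limited mind can be sketched with a toy example. This is not the Rationally-Shaped Mind model itself, whose details are not given here; it is an illustration under assumptions. The stimuli, probabilities, responses, and the notion of "capacity" as a number of stored stimulus-response entries are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

# Hypothetical environment: how often each stimulus occurs and its best response.
STIMULUS_PROB = {"food": 0.5, "predator": 0.3, "mate": 0.15, "rock": 0.05}
BEST_RESPONSE = {"food": "eat", "predator": "flee", "mate": "court", "rock": "ignore"}
DEFAULT_RESPONSE = "ignore"  # what the limited mind does for stimuli it has no entry for


def expected_utility(table: Dict[str, str]) -> float:
    """Expected utility of a stimulus-response table: 1 per correct response, 0 otherwise."""
    total = 0.0
    for stimulus, prob in STIMULUS_PROB.items():
        response = table.get(stimulus, DEFAULT_RESPONSE)
        total += prob * (1.0 if response == BEST_RESPONSE[stimulus] else 0.0)
    return total


def design_mind(capacity: int) -> Tuple[Dict[str, str], float]:
    """Exhaustively search all tables with `capacity` entries and return the best one."""
    best_table: Dict[str, str] = {}
    best_value = expected_utility(best_table)
    for subset in combinations(STIMULUS_PROB, capacity):
        table = {s: BEST_RESPONSE[s] for s in subset}
        value = expected_utility(table)
        if value > best_value:
            best_table, best_value = table, value
    return best_table, best_value


for capacity in range(4):
    table, value = design_mind(capacity)
    print(capacity, sorted(table), round(value, 2))
```

The designer does the expensive search once; the resulting mind only performs table lookups. As the capacity budget grows, the designer can afford to cover rarer stimuli, a small-scale analogue of richer minds becoming worthwhile as resources increase.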

This analysis shows that as resources increase, there is a natural progression of minds from simple stimulus-response systems, to systems that learn, to systems that deliberate, to systems that self-improve. It also shows that certain challenging drives arise in uncontrolled intentional systems: toward self-improvement, self-protection, avoidance of shutdown, self-reproduction, co-opting of resources, uncontrolled hardware construction, manipulation of human and economic systems, etc. We describe the work we are doing at Omai Systems to build safe intelligent systems that use formal methods to constrain behavior and to choose goals that align with human values. We envision a staged development of technologies in which early safe limited systems are used to develop more powerful successors and to help us clarify longer term goals. Enormous work will be needed but the consequences will transform the human future in ways that we can only begin to understand today.

Steve Omohundro is a computer scientist who has spent decades designing and writing artificial intelligence software. He now heads a startup corporation, Omai Systems, which will license intellectual property related to AI. In an interview with Sander Olson, Omohundro discusses Apollo-style AGI programs, limiting runaway growth in AI systems, and the ultimate limits of machine intelligence.

Question: How long have you been working in the AI field?

It’s been decades. As a student, I published research in machine vision and after my PhD in physics I went to Thinking Machines to develop parallel algorithms for machine vision and machine learning. Later, at the University of Illinois and other research centers, my students and I built systems to read lips, learn grammars, control robots, and do neural learning very efficiently. My current company, Omai Systems, and several other startups I’ve been involved with, develop intelligent technologies.

Question: Is it possible to build a computer which exhibits a high degree of general intelligence but which is not self-aware?

Omai Systems is developing intelligent technologies to license to other companies. We are especially focused on smart simulation, automated discovery, systems that design systems, and programs that write programs. I’ve been working with the issues around self-improving systems for many years and we are developing technology to keep these systems safe. We are working on exciting applications in a number of areas.

I define intelligence as the ability to solve problems using limited resources. It’s certainly possible to build systems that can do that without having a model of themselves. But many goal-driven systems will quickly develop the subgoal of improving themselves. And to do that, they will be driven to understand themselves. There are precise mathematical notions of self-modeling, but deciding whether those capture our intuitive sense of “self-awareness” will only come with more experience with these systems, I think.

Question: Is there a maximum limit to how intelligent an entity can become?

Analyses like Bekenstein’s bound and Bremermann’s limit place physical limits on how much computation physical systems can in principle perform. If the universe is finite, there is only a finite amount of computation that can be performed. If intelligence is based on computation, then that also limits intelligence. But the real interest in AI is in using that computation to solve problems in ever more efficient ways. As systems become smarter, they are likely to be able to use computational resources ever more efficiently. I think those improvements will continue until computational limits are reached. Practically, it appears that Moore’s law still has quite a way to go. And if big quantum computers turn out to be practical, then we will have vast new computational resources available.
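For a sense of scale, the two limits mentioned in this answer can be evaluated numerically. The sketch below uses the commonly quoted forms, Bremermann's limit as c²/h bits per second per kilogram and the Bekenstein bound as 2πRE/(ħc ln 2) bits; the function names and the 1 kg, 1 m example system are assumptions chosen for illustration.

```python
import math

C_LIGHT = 299_792_458.0          # speed of light, m/s (exact SI value)
H_PLANCK = 6.626_070_15e-34      # Planck constant, J*s (exact SI value)
H_BAR = H_PLANCK / (2 * math.pi)  # reduced Planck constant, J*s


def bremermann_limit_bits_per_second(mass_kg: float) -> float:
    """Maximum computation rate for a system of the given mass: m * c^2 / h."""
    return mass_kg * C_LIGHT ** 2 / H_PLANCK


def bekenstein_bound_bits(radius_m: float, mass_kg: float) -> float:
    """Maximum information in a sphere of given radius and mass-energy: 2*pi*R*E / (hbar*c*ln 2)."""
    energy_j = mass_kg * C_LIGHT ** 2
    return 2 * math.pi * radius_m * energy_j / (H_BAR * C_LIGHT * math.log(2))


print(f"Bremermann limit for 1 kg: {bremermann_limit_bits_per_second(1.0):.3e} bits/s")
print(f"Bekenstein bound for 1 kg within 1 m: {bekenstein_bound_bits(1.0, 1.0):.3e} bits")
```

Both figures are many orders of magnitude beyond present hardware, which is consistent with the point that efficiency gains have a long way to run before physical limits bind.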

Question: You have written extensively of self-improving systems. Wouldn’t such a system quickly get bogged down by resource limitations?

Many junior high students can program computers. And it doesn’t take a huge amount more study to be able to begin to optimize that code. As machines start becoming as smart as humans, they should be able to easily do simple forms of self-improvement. And as they begin to be able to prove more difficult theorems, they should be able to develop more sophisticated algorithms for themselves. Using straightforward physical modeling, they should also be able to improve their hardware. They probably will not be able to reach the absolutely optimal design for the physical resources they have available. But the effects of self-improvement that I’ve written about don’t depend on that in the least. They are very gross drives that should quickly emerge even in very sub-optimal designs.

Question: How would you respond to AI critics who argue that digital computation is not suitable for any form of “thinking”?

They may be right! Until we’ve actually built thinking machines, we cannot know for sure. But most neuroscientists believe that biological intelligence results from biochemical reactions occurring in the brain, and these processes should be able to be accurately simulated using digital computer hardware. But although brute-force approaches like this are likely to work, I believe that there are much better ways to emulate intelligence on digital machines.

Question: The AI field is seen to be divided between the “neat” and “scruffy” approaches. Which side are you on?

John McCarthy coined the term “Artificial Intelligence” in 1956. He started the Stanford Artificial Intelligence Lab with a focus on logical representations and mathematically “neat” theories. Marvin Minsky started the MIT lab and explored more “scruffy” systems based on neural models, self-organization, and learning. I had the privilege of taking classes on proving Lisp programs correct with McCarthy and of working with Minsky at Thinking Machines. I have come to see the value of both approaches and my own current work is a synthesis. We need precise logical representations to capture the semantics of the physical world and we need learning, self-organization, and probabilistic reasoning to build rich enough systems to model the world’s complexity.

Question: What is the single biggest impediment to AI development? Lack of funding? Insufficient hardware? An ignorance of how the brain works?

I don’t see hardware as the primary limitation. Today’s hardware can go way beyond what we are doing with it, and it is still rapidly improving. Funding is an issue. People tend to work on tasks for which they can get funding. And most funding is focused on building near term systems based on narrow AI. Brain science is advancing rapidly, but there still isn’t agreement over such basic issues as how memories are encoded, how learning takes place, or how computation takes place. I think there are some fundamental issues we still need to understand.

Question: An Apollo-style AGI program would be quite difficult to implement, given the profusion of approaches. Is there any way to address this problem?

The Apollo program was audacious but it involved solving a set of pretty clearly defined problems. The key sub-problems on the road to general AI aren’t nearly as clearly defined yet. I know that Ben Goertzel has published a roadmap claiming that human-level AGI can be created by 2023 for $25 million. He may be right, but I don’t feel comfortable making that kind of prediction. The best way to address the profusion of ideas is to fund a variety of approaches, and to clearly compare different approaches on the same important sub-problems.

Question: Do you believe that a hard takeoff or a soft takeoff is more likely?

What actually happens will depend on both technological and social forces. I believe either scenario is technologically possible. But I think slower development would be preferable. There will be many challenging moral and social choices we will need to make. I believe we will need time to make those choices wisely. We should do as much experimentation and use as much forethought as possible before making irreversible choices.

Question: What is sandboxing technology?

Sandboxing runs possibly dangerous systems in protected simulation environments to keep them from causing damage. It is used in studying the infection mechanisms of computer viruses, for example. People have suggested that it might be a good way to keep AI systems safe as we experiment with them.

Question: So is it feasible to create a sandboxing system that effectively limits an intelligent machine’s ability to interface with the outside world?

Eliezer Yudkowsky did a social experiment in which he played the AI and tried to convince human operators to let him out of the sandbox. In several of his experiments he was able to convince people to let him out of the box, even though they had to pay fairly large sums of real money for doing so. At Omai Systems we are taking a related, but different, approach which uses formal methods to create mathematically provable limitations on systems. The current computing and communications infrastructure is incredibly insecure. One of the first tasks for early safe AI systems will be to help design an improved infrastructure.

Question: If you had a multibillion dollar budget, what steps would you take to rapidly bring about AGI?

I don’t think that rapidly bringing about AGI is the best initial goal. I would feel much better about it if we had a clear roadmap for how these systems will be safely integrated into society for the benefit of humanity. So I would be funding the creation of that kind of roadmap and deeply understanding the ramifications of these technologies. I believe the best approach will be to develop provably limited systems and to use those in designing more powerful ones that will have a beneficial impact.

Question: What is your concept of the singularity? Do you consider yourself a singularitarian?

Although I think the concept of a singularity is fascinating, I am not a proponent of the concept. The very term singularity presupposes the way that the future will unfold. And I don’t think that presupposition is healthy because I believe a slow and careful unfolding is preferable to a rapid and unpredictable one.

Design Principles for a Safe and Beneficial AGI Infrastructure

Steve Omohundro, Ph.D., Omai Systems

Abstract:

Many believe we are on the verge of creating true AGIs and that these systems will be central to the future functioning of human society. These systems are likely to be integrated with 3 other emerging technologies: biotechnology, robotics, and nanotechnology. Together, these technologies have the potential to solve many of humanity’s perennial problems: disease, aging, war, poverty, transportation, pollution, etc. But they also introduce a host of new challenges. As AGI scientists, we are in a position to guide these technologies for the greatest human good. But what guidelines should we follow as we develop our systems?

This talk will describe the approach we are taking at Omai Systems to develop intelligent technologies in a controlled, safe, and positive way. We start by reviewing the challenging drives that arise in uncontrolled intentional systems: toward self-improvement, self-protection, avoidance of shutdown, self-reproduction, co-opting of resources, uncontrolled hardware construction, manipulation of human and economic systems, etc.

One conundrum is that to solve these problems in a general way, we probably will need the assistance of AGI systems. Our approach to solving this is to work in stages. We begin with a special class of systems designed and built to be intentionally limited in ways that prevent undesirable behaviors while still being capable of intelligent problem solving. Crucial to the approach is the use of formal methods to provide mathematical guarantees of desired properties. Desired safety properties include: running only on specified hardware, using only specified resources, reliably shutting down under specified conditions, limiting self-improvement in precise ways, etc.

The initial safe systems are intended to design a more powerful safe hardware and computing infrastructure. This is likely to include a global “immune system” for protection against accidents and malicious systems. These systems are also meant to help create careful models of human values and to design utility functions for future systems that lead to positive human consequences. They are also intended to analyze the complex game-theoretic dynamics of AGI/human ecosystems and to design social contracts that lead to cooperative equilibria.

The future of humanity involves a complex combination of technological, psychological and social factors – and one of the difficulties we face in comprehending and crafting this future is that not many people or organizations are adept at handling all these aspects. Dr. Stephen Omohundro is one of the fortunate exceptions to this general pattern, and this is part of what gives his contributions to the futurist domain such a unique and refreshing twist.

Steve has a substantial pedigree and experience in the hard sciences, beginning with degrees in Mathematics and Physics from Stanford and a Ph.D. in Physics from U.C. Berkeley. He was a professor in the computer science department at the University of Illinois at Urbana-Champaign, cofounded the Center for Complex Systems Research, authored the book “Geometric Perturbation Theory in Physics”, designed the programming languages StarLisp and Sather, wrote the 3D graphics system for Mathematica, and built systems which learn to read lips, control robots, and induce grammars. I’ve had some long and deep discussions with Steve about advanced artificial intelligence, both my own approach and his own unique AI designs.

But he has also developed considerable expertise and experience in understanding and advising human minds and systems. Via his firm Self-Aware Systems, he has worked with clients using a variety of individual and organizational change processes including Rosenberg’s Non-Violent Communication, Gendlin’s Focusing, Travell’s Trigger Point Therapy, Bohm’s Dialogue, Beck’s Life Coaching, and Schwartz’s Internal Family Systems Therapy.

Steve’s papers and talks on the future of AI, society and technology – including The Wisdom of the Global Brain and Basic AI Drives – reflect this dual expertise in technological and human systems. In this interview I was keen to mine his insights regarding the particular issue of the risks facing the human race as we move forward along the path of accelerating technological development.

Ben:

A host of individuals and organizations – Nick Bostrom, Bill Joy, the Lifeboat Foundation, the Singularity Institute, and the Millennium Project, to name just a few – have recently been raising the issue of the “existential risks” that advanced technologies may pose to the human race. I know you’ve thought about the topic a fair bit as well, both from the standpoint of your own AI work and more broadly. Could you share the broad outlines of your thinking in this regard?

Steve:

I don’t like the phrase “existential risk” for several reasons. It presupposes that we are clear about exactly what “existence” we are risking. Today, we have a clear understanding of what it means for an animal to die or a species to go extinct. But as new technologies allow us to change our genomes and our physical structures, it will become much less clear when we have lost something precious. Death and extinction become much more amorphous concepts in the presence of extensive self-modification.

It’s easy to identify our humanity with our individual physical form and our egoic minds. But in reality our physical form is an ecosystem: only 10% of our cells are human. And our minds are also ecosystems composed of interacting subpersonalities. And our humanity is as much in our relationships, interconnections, and culture as it is in our individual minds and bodies. The higher levels of organization are much more amorphous and changeable and it will be hard to pin down when something precious is lost.

So, I believe the biggest “existential risk” is related to identifying the qualities that are most important to humanity and to ensuring that technological forces enhance those rather than eliminate them. Already today we see many instances where economic forces act to create “soulless” institutions that tend to commodify the human spirit rather than inspire and exalt it.

Some qualities that I see as precious and essentially human include: love, cooperation, humor, music, poetry, joy, sexuality, caring, art, creativity, curiosity, love of learning, story, friendship, family, children, etc. I am hopeful that our powerful new technologies will enhance these qualities. But I also worry that attempts to precisely quantify them may in fact destroy them. For example, the attempts to quantify performance in our schools using standardized testing have tended to inhibit our natural creativity and love of learning.

Perhaps the greatest challenge that will arise from new technologies will be to really understand ourselves and identify our deepest and most precious values.

Ben:

Yes…. After all, “humanity” is a moving target, and today’s humanity is not the same as the humanity of 500 or 5000 years ago, and humanity of 100 or 5000 years from now – assuming it continues to exist – will doubtless be something dramatically different. But still there’s been a certain continuity throughout all these changes, and part of that doubtless is associated with the “fundamental human values” that you’re talking about.

Still, though, there’s something that nags at me here. One could argue that none of these precious human qualities are practically definable in any abstract way, but they only have meaning in the context of the totality of human mind and culture. So that if we create a fundamentally nonhuman AGI that satisfies some abstracted notion of human “family” or “poetry”, it won’t really satisfy the essence of “family” or “poetry”. Because the most important meaning of a human value doesn’t lie in some abstract characterization of it, but rather in the relation of that value to the total pattern of humanity. In this case, the extent to which a fundamentally nonhuman AGI or cyborg or posthuman or whatever would truly demonstrate human values, would be sorely limited. I’m honestly not sure what I think about this train of thought. I wonder what’s your reaction.

Steve:

That’s a very interesting perspective! In fact it meshes well with a perspective I’ve been slowly coming to, which is to think of the totality of humanity and human culture as a kind of “global mind”. As you say, many of our individual values really only have meaning in the context of this greater whole. And perhaps it is this greater whole that we should be seeking to preserve and enhance. Each individual human lives only for a short time but the whole of humanity has a persistence and evolution beyond any individual. Perhaps our goal should be to create AGIs that integrate, preserve, and extend the “global human mind” rather than trying solely to mimic individual human minds and individual human values.

Ben:

Perhaps a good way to work toward this is to teach our nonhuman or posthuman descendants human values by example, and by embedding them in human culture so they absorb human values implicitly, like humans do. In this case we don’t need to “quantify” or isolate our values to pass them along to these other sorts of minds….

Steve:

That sounds like a good idea. In each generation, the whole of human culture has had to pass through a new set of minds. It is therefore well adapted to being learned. Aspects which are not easily learnable are quickly eliminated. I’m fascinated by the process by which each human child must absorb the existing culture, discover his own values, and then find his own way to contribute. Philosophy and moral codes are attempts to codify and abstract the learnings from this process but I think they are no substitute for living the experiential journey. AGIs which progress in this way may be much more organically integrated with human society and human nature. One challenging issue, though, is likely to be the mismatch of timescales. AGIs will probably rapidly increase in speed and keeping their evolution fully integrated with human society may become a challenge.

Ben:

Yes, it’s been amazing to watch that learning process with my own 3 kids, as they grow up.

It’s great to see that you and I seem to have a fair bit of common understanding on these matters. This reminds me, though, that a lot of people see these things very, very differently. Which leads me to my next question: What do you think are the biggest misconceptions afoot, where existential risk is concerned?

Steve:

I don’t think the currently fashionable fears like global warming, ecosystem destruction, peak oil, etc. will turn out to be the most important issues. We can already see how emerging technologies could, in principle, deal with many of those problems. Much more challenging are the core issues of identity, which the general public hasn’t really even begun to consider. Current debates about stem cells, abortion, cloning, etc. are tiny precursors of the deeper issues we will need to explore. And we don’t really yet have a system for public discourse or decision making that is up to the task.

Ben:

Certainly a good point about public discourse and decision making systems. The stupidity of most YouTube comments, and the politicized (in multiple senses) nature of the Wikipedia process, make clear that online discourse and decision-making both need a lot of work. And that’s not even getting into the truly frightening tendency of the political system to reduce complex issues to oversimplified caricatures.

Given the difficulty we as a society currently have in talking about, or making policies about, things as relatively straightforward as health care reform or marijuana legalization or gun control, it’s hard to see how our society could coherently deal with issues related to, say, human-level AGI or genetic engineering of novel intelligent lifeforms!

For instance, the general public’s thinking about AGI seems heavily conditioned by science-fiction movies like Terminator 2, which clouds consideration of the deep and in some ways difficult issues that you see when you understand the technology a little better. And we lack the systems needed to easily draw the general public into meaningful dialogues on these matters with the knowledgeable scientists and engineers.

So what’s the solution? Do you have any thoughts on what kind of system might work better?

Steve:

I think Wikipedia has had an enormous positive influence on the level of discourse in various areas. It’s no longer acceptable to plead ignorance of basic facts in a discussion. Other participants will just point to a Wikipedia entry. And the rise of intelligent bloggers with expertise in specific areas is also having an amazing impact. One example I’ve been following closely is the debates and discussions about various approaches to diet and nutrition.

A few years back, T. Colin Campbell’s “The China Study” was promoted as the most comprehensive study of nutrition, health, and diet ever conducted. The book and the study had a huge influence on people’s thinking about health and diet. A few months ago, 22-year-old English major Denise Minger decided to reanalyze the data in the study and found that they did not support the original conclusions. She wrote about her discoveries on her blog and sparked an enormous discussion all over the health and diet blogosphere that dramatically shifted many people’s opinions. The full story can be heard in her interview.

It would have been impossible for her to have had that kind of impact just a few years ago. The rapidity with which incorrect ideas can be corrected and the ease with which many people can contribute to new understanding is just phenomenal. I expect that systems to formalize and enhance that kind of group thinking and inquiry will be created to make it even more productive.

Ben:

Yes, I see – that’s a powerful example. The emerging Global Brain is gradually providing us the tools needed to communicate and collectively think about all the changes that are happening around and within us. But it’s not clear if the communication mechanisms are evolving fast enough to keep up with the changes we need to discuss and collectively digest….

On the theme of rapid changes, let me now ask you something a little different — about AGI…. I’m going to outline two somewhat caricaturish views on the topic and then probe your reaction to them!

First of all, one view on the future of AI and the Singularity is that there is an irreducible uncertainty attached to the creation of dramatically greater than human intelligence. That is, in this view, there probably isn’t really any way to eliminate or drastically mitigate the existential risk involved in creating superhuman AGI. So, in this view, building superhuman AI is essentially plunging into the Great Unknown and swallowing the risk because of the potential reward.

On the other hand, an alternative view is that if we engineer and/or educate our AGI systems correctly, we can drastically mitigate the existential risk associated with superhuman AGI, and create a superhuman AGI that’s highly unlikely to pose an existential risk to humanity.

What are your thoughts on these two perspectives?

Steve:

I think that, at this point, we have tremendous leverage in choosing how we build the first intelligent machines and in choosing the social environment that they operate in. We can choose the goals of those early systems and those choices are likely to have a huge effect on the longer-term outcomes. I believe it is analogous to choosing the constitution for a country. We have seen that the choice of governing rules has an enormous effect on the quality of life and the economic productivity of a population.

Ben:

That’s an interesting analogy. And an interesting twist on the analogy may be the observation that to have an effectively working socioeconomic system, you need both good governing rules, and a culture oriented to interpreting and implementing the rules sensibly. In some countries (e.g. China comes to mind, and the former Soviet Union) the rules as laid out formally are very, very different from what actually happens. The reason I mention this is: I suspect that in practice, no matter how good the “rules” underlying an AGI system are, if the AGI is embedded in a problematic culture, then there’s a big risk for something to go awry. The quality of any set of rules supplied to guide an AGI is going to be highly dependent on the social context…

Steve:

Yes, I totally agree! The real rules are a combination of any explicit rules written in lawbooks and the implicit rules in the social context. That again highlights the importance of AGIs integrating smoothly into the social context.

Ben:

One might argue that we should first fix some of the problems of our cultural psychology, before creating an AGI and supplying it with a reasonable ethical mindset and embedding it in our culture. Because otherwise the “embedding in our culture” part could end up unintentionally turning the AGI to the dark side!! Or on the other hand, maybe AGI could be initially implemented and deployed in such a way as to help us get over our communal psychological issues…. Any thoughts on this?

Steve:

Agreed! Perhaps the best outcome would be technologies that first help us solve our communal psychological issues and then as they get smarter evolve with us in an integrated fashion.

Ben:

On the other hand, it’s not obvious to me that we’ll be able to proceed that way, because of the probability – in my view at any rate – that we’re going to need to rely on advanced AGI systems to protect us from other technological risks.

For instance, one approach that’s been suggested, in order to mitigate existential risks, is to create a sort of highly intelligent “AGI Nanny” or “Singularity Steward.” This would be a roughly human-level AGI system without capability for dramatic self-modification, and with strong surveillance powers, given the task of watching everything that humans do and trying to ensure that nothing extraordinarily dangerous happens. One could envision this as a quasi-permanent situation, or else as a temporary fix to be put into place while more research is done regarding how to launch a Singularity safely.

Any thoughts on this sort of AI Nanny scenario?

Steve:

I think it’s clear that we will need a kind of “global immune system” to deal with inadvertent or intentional harm arising from powerful new technologies like biotechnology and nanotechnology. The challenge is to make protective systems powerful enough for safety but not so powerful that they themselves become a problem. I believe that advances in formal verification will enable us to produce systems with provable properties of this type. But I don’t believe this kind of system on its own will be sufficient to deal with the deeper issues of preserving the human spirit.

Ben:

What about the “one AGI versus many” issue? One proposal that’s been suggested, to mitigate the potential existential risk of human-level or superhuman AGIs, is to create a community of AGIs and have them interact with each other, comprising a society with its own policing mechanisms and social norms and so forth. The different AGIs would then keep each other in line. A “social safety net” so to speak.

Steve:

I’m much more drawn to “ecosystem” approaches which involve many systems of different types interacting with one another in such a way that each acts to preserve the values we care about. I think that alternative singleton “dictatorship” approaches could also work but they feel much more fragile to me in that design mistakes might become rapidly irreversible. One approach to limiting the power of individuals in an ecosystem is to limit the amount of matter and free energy they may use while allowing them freedom within those bounds. A challenge to that kind of constraint is the formation of coalitions of small agents that act together to overthrow the overall structure. But if we build agents that want to cooperate in a defined social structure, then I believe the system can be much more stable. I think we need much more research into the space of possible social organizations and their game theoretic consequences.

Ben:

Finally – bringing the dialogue back to the practical and near-term – I wonder what you think society could be doing now to better militate against existential risks … from AGI or from other sources?

Steve:

Much more study of social systems and their properties, better systems for public discourse and decision making, deeper inquiry into human values, improvements in formal verification of properties in computational systems.

Ben:

That’s certainly sobering to consider, given the minimal amount of societal resources currently allocated to such things, as opposed to, for example, the creation of weapons systems, better laptop screens or chocolaty-er chocolates!

To sum up, it seems one key element of your perspective is the importance of deeper collective (and individual) self-understanding – deeper intuitive and intellectual understanding of the essence of humanity. What is humanity, that it might be preserved as technology advances and wreaks its transformative impacts? Another key element is your view that social networks of advanced AGIs are more likely than isolated AGI systems to help humanity grow and preserve its core values. And then there’s your focus on the wisdom of the global brain. And clearly there are multiple connections between these elements, for instance a focus on the way ethical, aesthetic, intellectual and other values emerge from social interactions between minds. It’s a lot to think about … but fortunately none of us has to figure it out on our own!