Tim Finin

Professor of Computer Science and Electrical Engineering, University of Maryland

Tim Finin is a Professor of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County (UMBC). He has over 30 years of experience in applications of Artificial Intelligence to problems in information systems and language understanding. His current research is focused on the Semantic Web, mobile computing, analyzing and extracting information from text and online social media, and on enhancing security and privacy in information systems.

Finin received an S.B. degree in Electrical Engineering from MIT and a Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign. He has held full-time positions at UMBC, Unisys, the University of Pennsylvania, and the MIT AI Laboratory. He is the author of over 300 refereed publications and has received research grants and contracts from a variety of sources. He participated in the DARPA/NSF Knowledge Sharing Effort and helped lead the development of the KQML agent communication language and was a member of the W3C Web Ontology Working Group that standardized the OWL Semantic Web language.

Finin has chaired the UMBC Computer Science Department, served on the board of directors of the Computing Research Association, been a AAAI councilor, and chaired several major research conferences. He is currently an editor-in-chief of the Elsevier Journal of Web Semantics.

Pat Hayes

Pat Hayes has a BA in mathematics from Cambridge University and a PhD in Artificial Intelligence from Edinburgh. He has been a professor of computer science at the University of Essex and philosophy at the University of Illinois, and the Luce Professor of cognitive science at the University of Rochester. He has been a visiting scholar at Universite de Geneve and the Center for Advanced Study in the Behavioral Sciences at Stanford, and has directed applied AI research at Xerox-PARC, SRI and Schlumberger, Inc. At various times, Pat has been secretary of AISB, chairman and trustee of IJCAI, associate editor of Artificial Intelligence, a governor of the Cognitive Science Society and president of AAAI.

Pat's research interests include knowledge representation and automatic reasoning, especially the representation of space and time; the semantic web; ontology design; image description and the philosophical foundations of AI and computer science. During the past decade Pat has been active in the Semantic Web initiative, largely as an invited member of the W3C Working Groups responsible for the RDF, OWL and SPARQL standards. Pat is a member of the Web Science Trust and of OASIS, where he works on the development of ontology standards.

In his spare time, Pat restores antique mechanical clocks and remodels old houses. He is also a practicing artist, with works exhibited in local competitions and international collections. Pat is a charter Fellow of AAAI and of the Cognitive Science Society, and has professional competence in domestic plumbing, carpentry and electrical work.

The Interview:

Brandon Rohrer: This is an entertaining survey. I appreciate the specificity with which you've worded some of the questions. I don't have a defensible or scientific answer to any of the questions, but I've included some answers below that are wild-ass guesses. You got some good and thoughtful responses. I've been enjoying reading them. Thanks for compiling them.

Q1: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of roughly human-level machine intelligence?

Pat Hayes: I do not consider this question to be answerable, as I do not accept this (common) notion of "human-level intelligence" as meaningful. Artificially intelligent artifacts are in some ways superhuman, and have been for many years now; but in other ways, they are sub-human, or perhaps it would be better to say, non-human. They simply differ from human intelligences, and it is inappropriate to speak of "levels" of intelligence in this way. Intelligence is too complex and multifaceted a topic to be spoken of as though it were something like sea level that can be calibrated on a simple linear scale.

If by 'human-level' you mean, the AI will be an accurate simulacrum of a human being, or perhaps a human personality (as is often envisioned in science fiction, eg HAL from "2001") my answer would be, never. We will never create such a machine intelligence, because it is probably technically close to impossible, and not technically useful (note that HAL failed in its mission through being TOO "human": it had a nervous breakdown. Bad engineering.) But mostly because we have absolutely no need to do so. Human beings are not in such short supply at present that it makes sense to try to make artificial ones at great cost. And actual AI work, as opposed to the fantasies often woven around it by journalists and futurists, is not aiming to create such things. A self-driving car is not an artificial human, but it is likely to be a far better driver than any human, because it will not be limited by human-level attention spans and human-level response times. It will be, in these areas, super-human, just as present computers are superhuman at calculation and keeping track of large numbers of complex patterns, etc.

Q2: What probability do you assign to the possibility of human extinction as a result of badly done AI?

Explanatory remark to Q2:

P(human extinction | badly done AI) = ?

(Where 'badly done' = AGI capable of self-modification that is not provably non-dangerous.)

Brandon Rohrer: < 1%

Tim Finin: 0.001

Pat Hayes: Zero. The whole idea is ludicrous.

Q3: What probability do you assign to the possibility of a human level AGI to self-modify its way up to massive superhuman intelligence within a matter of hours/days/< 5 years?

Pat Hayes: Again, zero. Self-modification in any useful sense has never been technically demonstrated. Machine learning is possible and indeed is a widely used technique (no longer only in AI) but a learning engine is the same thing after it has learnt something as it was before, just as biological learners are. When we learn, we get more informed, but not more intelligent: similarly with machines.

Q4: Is it important to figure out how to make AI provably friendly to us and our values (non-dangerous), before attempting to solve artificial general intelligence?

Explanatory remark to Q4:

How much money is currently required to mitigate possible risks from AI (to be instrumental in maximizing your personal long-term goals, e.g. surviving this century), less/no more/little more/much more/vastly more?

Brandon Rohrer: No more.

Tim Finin: No.

Pat Hayes: No. There is no reason to suppose that any manufactured system will have any emotional stance towards us of any kind, friendly or unfriendly. In fact, even if the idea of "human-level" made sense, we could have a more-than-human-level super-intelligent machine, and still have it bear no emotional stance towards other entities whatsoever. Nor need it have any lust for power or political ambitions, unless we set out to construct such a thing (which AFAIK, nobody is doing.) Think of an unworldly boffin who just wants to be left alone to think, and does not care a whit for changing the world for better or for worse, and has no intentions or desires, but simply answers questions that are put to it and thinks about things that it is asked to think about. It has no ambition and in any case no means to achieve any far-reaching changes even if it "wanted" to do so. It seems to me that this is what a super-intelligent question-answering system would be like. I see no inherent, even slight, danger arising from the presence of such a device.

What existential risk (human extinction type event) is currently most likely to have the greatest negative impact on your personal long-term goals, under the condition that nothing is done to mitigate the risk?

Pat Hayes: No. Nanotechnology has the potential to make far-reaching changes to the actual physical environment. AI poses no such threat. Indeed, I do not see that AI itself (that is, actual AI work being done, rather than the somewhat uninformed fantasies that some authors, such as Ray Kurzweil, have invented) poses any serious threat to anyone.

I would say that any human-extinction type event is likely to make a serious dent in my personal goals. (But of course I am being sarcastic, as the question as posed seems to me to be ridiculous.)

When I think of the next century, say, the risk I am most concerned about is global warming and the resulting disruption to the biosphere and human society. I do not think that humans will become extinct, but I think that our current global civilization might not survive.

Q6: What is the current level of awareness of possible risks from AI, relative to the ideal level?

Brandon Rohrer: High.

Tim Finin: About right.

Pat Hayes: The actual risks are negligible: the perceived risks (thanks to the popularization of such nonsensical ideas as the "singularity") are much greater.

Q7: Can you think of any milestone such that if it were ever reached you would expect human-level machine intelligence to be developed within five years thereafter?

Brandon Rohrer: No, but the demonstrated ability of a robot to learn from its experience in a complex and unstructured environment is likely to be a milestone on that path, perhaps signalling HLI is 20 years away.

Tim Finin: Passing a well-constructed, open-ended Turing test.

Pat Hayes: No. There are no 'milestones' in AI. Progress is slow but steady, and there are no magic bullets.

Anonymous

The following are replies from experts who either did not answer the questions for various reasons or didn't want them to be published.

Expert 1: Sorry, I don't want to do an email interview - it is too hard to qualify comments.

Expert 2: Thanks for your inquiry - but as you note I am a roboticist and not a futurist, so I generally try to avoid speculation.

Expert 3: my firmest belief about the timeline for human-level AI is that we can't estimate it usefully. partly this is because i don't think "human level AI" will prove to be a single thing (or event) that we can point to and say "aha there it is!". instead i think there will be a series of human level abilities that are achieved. in fact some already have (though many more haven't).

(on the other hand, i think shooting for human-level AI is a good long term research goal. it doesn't need to be one thing in the end to be a good focus of work.)

another important catch, with respect to the "risk from human level AI" equation, is that i don't think human level AI immediately leads to super-human level AI. we have had many human-level humans working on AI for a long time, and haven't added up to even a single human. i don't think it is necessarily (or even likely) the case that a human level AI would have much more luck at making itself smarter than we have been....

Expert 4: Thanks for this - fascinating questions, and I am a great supporter of probability elicitation, but only from people who are well-informed about the subject-matter! And I am afraid this does not include me - I am sure I should know more about this, but I don't, and so am unwilling to express publicly any firm opinion.

Of course in private in a bar I may be more forthcoming!

Expert 5: Interesting questions, I'll enjoy seeing your published results! Unfortunately, now that I work at ****** (through the acquisition of one of my companies, ******), there are policies in place that prohibit me from participating in this kind of exercise.

Expert 6: I don't think I can answer your questions in a meaningful way...

Expert 7: Thanks for your interest. I feel that this is not in the area of my primary expertise. However, I'd refer you to ****** (a colleague, and co-chair of the *******) who I think might be in a better position to give you current and informed answers.

Expert 8: Unfortunately, most of these questions do not have a simple answer, in my opinion, so I can't just say "five years" or whatever -- I would have to write a little essay in order to give an answer that reflects what I really believe. For example, the concept of "roughly human-level intelligence" is a complicated one, and any simple answer would be misleading. By some measures we're already there; by other measures, the goal is still far in the future. And I think that the idea of a "provably friendly" system is just meaningless.

Anyway, good luck with your survey. I'm sure you'll get simple answers from some people, but I suspect that you will find them confusing or confused.

Expert 9: Thank you for your email. I do not feel comfortable answering your questions for a public audience.

Expert 10: sorry no reply for such questions

Expert 11: I regard speculation about AI as a waste of time. We are at an impasse: none of our current techniques seems likely to provide truly human-like intelligence. I think what's needed is a conceptual breakthrough from someone comparable to Newton or Einstein. Until that happens, we're going to remain stuck, although there will be lots of useful technology coming along. It won't be "intelligent" or "conscious" the way humans are, but it might do a really good job of guessing what movies we want to watch or what news stories interest us the most.

Given our current state of ignorance, I feel that speculating about either the timeline or the impact of AI is best left to science fiction writers.

More interviews forthcoming (hopefully). At least one person told me that the questions are extremely important and that he would work out some answers over the next few days.

Q: Are you familiar with formal concepts of optimal AI design which relate to searches over complete spaces of computable hypotheses or computational strategies, such as Solomonoff induction, Levin search, Hutter's algorithm M, AIXI, or Gödel machines?

FWIW, yes, I am familiar with Solomonoff induction, Hutter's algorithm and this field generally. Other than being part of the background theory of computer science and computability/complexity theory, I do not consider it to be directly relevant to actual AI practice, nor that it will produce any kind of quantum leap in general AI (or indeed in general computer science or computer engineering.)

I note that some of your commentators have distinguished AI from General AI, or GAI, a new "field". As far as I can tell, GAI has not yet actually achieved anything, and seems to be predicated on a false view of the reality of AI work. I strongly suspect that the only reason for the existence of GAI is a frustration with the fact that most experts in actual AI do not find the more hysterical dystopian AI futurism plausible.

I agree that AI is to computers as intelligence is to brains. Faster electronics and more transistors only equate to faster neurones and more synapses. Intelligence is the manner in which the neurones are applied and that is a totally different view.

We (my antecedents and myself) have studied human intelligence for around 60 years and have developed the manner in which humans think and make rational decisions. We have produced software that thinks faster and better than humans. It adapts to ANY situation (mining, retail, banking, consulting, shopping, arguing, fixing...) and even has moods.

So far the interest has been poor, probably due to ignorance, ego, disbelief or jealousy - who knows - but we can defend the design and the design philosophy in any level of detail.

I think experts' opinions on the possibility of AI self-improvement may covary with their awareness of work on formal, machine-representable concepts of optimal AI design, particularly Solomonoff induction, including its application to reinforcement learning as in AIXI, and variations of Levin search such as Hutter's algorithm M and Gödel machines. If an expert is unaware of those concepts, this unawareness may serve to explain away the expert's belief that there are no approaches to engineering self-improvement-capable AI on any foreseeable horizon.

If it's not too late, you should probably include a question to judge the expert's awareness of these concepts in your questionnaires, such as:

"Qn: Are you familiar with formal concepts of optimal AI design which relate to searches over complete spaces of computable hypotheses or computational strategies, such as Solomonoff induction, Levin search, Hutter's algorithm M, AIXI, or Gödel machines?"

...bearing in mind that the presence of such a question may affect their other answers.

(This was part of what I was getting at with my analysis of the AAAI panel interim report: "What cached models of the planning abilities of future machine intelligences did the academics have available [...]?" "What fraction of the academics are aware of any current published AI architectures which could reliably reason over plans at the level of abstraction of 'implement a proxy intelligence'?")

Other errors which might explain away an expert's unconcern for AI risk are:

when considering AI self-improvement scenarios, incautious thinking about parameter uncertainty and structural uncertainty in economic descriptions of computational complexity costs and efficiency gains over time (particularly given that a general AI will be motivated to investigate many different possible structures for the process for self-improvement, including structures one may not oneself have considered, in order to choose a process whose economics are as favorable as possible); and

incomplete reasoning about options for gathering information about technical factors affecting AI risk scenarios, when considering the potential relative costs of delaying AI safety projects until better information is available (on the implicit expectation that, in the event that the technical factors turn out to imply safety, delaying will have prevented the cost of the AI safety projects, and (more viscerally) that having advocated delay will prevent one's own loss of prestige, unthinkingly taken as a proxy for correctness, whereas failure to have advocated an immediate start to AI safety projects could not result in loss of one's own prestige in any event).

However, it's harder to find uncontroversial questions which would be diagnostic of these errors.

However, it's harder to find uncontroversial questions which would be diagnostic of these errors.

Perhaps an expert's beliefs about the costs of better information and the costs of delay might be assessed with a willingness-to-pay question, such as a tradeoff involving a hypothetical benefit to everyone now living on Earth which could be sacrificed to gain hypothetical perfect understanding of some technical unknowns related to AI risks, or a hypothetical benefit gained at the cost of perfect future helplessness against AI risks. However, even this sort of question might seem to frame things hyperbolically.

First, keep up the good work of asking the experts! Also, I am glad that Pat Hayes actually bothered answering the questions with more than just a dismissive "no".

He has an excellent point that AI is not evolving anywhere close to the same path as human intelligence. It is more intelligent than people in many areas, while it is a total savant in others. Its development is not guided by natural selection. There is so little predictability about which way AI development will go that I find it ridiculous to compare AI and human intelligence. Maybe a couple more decades of actual research (possibly including decision theory) will lift the fog a bit, who knows.

Pat Hayes: No. There is no reason to suppose that any manufactured system will have any emotional stance towards us of any kind, friendly or unfriendly. In fact, even if the idea of "human-level" made sense, we could have a more-than-human-level super-intelligent machine, and still have it bear no emotional stance towards other entities whatsoever.

Exactly, it doesn't care about humans. It isn't friendly to them. Non-friendly. That's what unfriendly means as a technical word in this context. Not 'nasty' or malicious. Just not friendly. That should be terrifying.

Nor need it have any lust for power or political ambitions, unless we set out to construct such a thing (which AFAIK, nobody is doing.) Think of an unworldly boffin who just wants to be left alone to think, and does not care a whit for changing the world for better or for worse, and has no intentions or desires, but simply answers questions that are put to it and thinks about things that it is asked to think about.

Boom! A light cone of computronium. Ooops.

What does a 'boffin' do when it wants to answer a question it doesn't yet have an answer for? It researches, studies and thinks. A general intelligence that only cares about answering the question given to it does just that. As effectively as it can with the resources it has available to it. Unless it is completely isolated from all external sources of information it will proceed directly to creating more of itself as soon as it has been given a difficult question. The very best you could hope for if the question answerer is completely isolated is an AI Box. If Pat is the gatekeeper then R. I. P. humanity.

It has no ambition and in any case no means to achieve any far-reaching changes even if it "wanted" to do so. It seems to me that this is what a super-intelligent question-answering system would be like. I see no inherent, even slight, danger arising from the presence of such a device.

A general intelligence that only cares about answering the question given to it does just that. As effectively as it can with the resources it has available to it. Unless it is completely isolated from all external sources of information it will proceed directly to creating more of itself as soon as it has been given a difficult question. The very best you could hope for if the question answerer is completely isolated is an AI Box. If Pat is the gatekeeper then R. I. P. humanity.

This need not be the case. Whenever we talk about software "wanting" something, we are of course speaking metaphorically. It might be straightforward to build a super-duper Watson or Wolfram Alpha, that responds to natural queries "intelligently", without the slightest propensity to self-modify or radically alter the world. You might even imagine such a system having a background thread trying to pre-compute answers to interesting questions and share them with humans, once per day, without any ability to self-modify or significant probability of radical alteration to human society.

You have a point, but a powerful question-answering device can be dangerous even if it stays inside the box. You could ask it how to build nanotech. You could ask it how to build an AI that would uphold national security. You could ask it who's likely to commit a crime tomorrow, and receive an answer that manipulates you to let the crime happen so the prediction stays correct.

This depends how powerful the answerer is. If it's as good as a human expert, it's probably not dangerous -- at least, human experts aren't. Certainly, I would rather keep such a system out of the hands of criminals or the insane -- but it doesn't seem like that system, alone, would be a serious risk to humanity.

After taking a look at the research pages, I'm not very afraid of these people, at least not until they get computers powerful enough to brute-force AGI by simulated evolution or some other method. I'm more afraid of Shane Legg who does top-notch technical work (far beyond anything I'm capable of), understands the danger of uFAI and ranks it as the #1 existential risk, and still cheers for stuff like Monte Carlo AIXI. I'm afraid of Abram Demski who wrote brilliant comments on LW and still got paid to help design a self-improving AGI (Genifer).

24 out of 26?! Since Eliezer won his first two, I was already reasonably certain that AI boxing is effectively impossible (at least once you give it the permission to talk to some humans), so I won't meaningfully update here. But this piece of evidence was quite unexpected.

A mind designed by evolution could be big and messy, about as complex as the human brain. Right now we have no computer powerful enough to simulate even a single human brain, and evolution requires many of those. Of course there are many possible shortcuts, but we don't seem to be there yet.

All of the latter has been evolved in a digital environment with no additional expert knowledge of humans. Sooner or later, we will be evolving pretty much everything. All the big talk about the AI of some web experts aside.

Why? These guys think things are going to be fine. You should raise your probability estimate that humanity will survive the next century. This is great news!

Or, if you have reason to believe that things are not going to be fine it may be appropriate to lower your estimate that humanity will survive the next century. People not being aware (or denying) threats are less likely to do what is necessary to prevent them. If we accept XiXidu's implied premise that these guys are particularly relevant then their belief that things are fine is an existential risk.

(It happens that I don't accept the premise. Narrow AI is a completely different subject to GAI and experts are notorious for overestimating the extent that their expertise applies to loosely related areas.)

Or, if you have reason to believe that things are not going to be fine it may be appropriate to lower your estimate that humanity will survive the next century

Okay, but this seems to violate conservation of expected evidence. Either you can be depressed by the answer "we're all going to die" or, less plausibly, by the answer "Everything is going to be fine", but not both.
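For readers who have not met the term, conservation of expected evidence says that the prior must equal the probability-weighted average of the possible posteriors, so a piece of evidence cannot be expected to lower the estimate whichever way it comes out. A minimal numeric sketch follows; the probabilities are made up purely for illustration (H stands for "humanity survives the century", E for "the experts say things will be fine"), and none of them come from the thread.

```python
def posteriors(prior, p_e_given_h, p_e_given_not_h):
    """Return P(E), P(H|E) and P(H|not E) via Bayes' rule."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    post_if_e = p_e_given_h * prior / p_e
    post_if_not_e = (1 - p_e_given_h) * prior / (1 - p_e)
    return p_e, post_if_e, post_if_not_e

# Hypothetical numbers, chosen only to make the arithmetic visible.
prior = 0.3
p_e, post_if_e, post_if_not_e = posteriors(prior, 0.8, 0.4)

# Weighting each posterior by how likely that answer is recovers the prior.
average_posterior = p_e * post_if_e + (1 - p_e) * post_if_not_e
print(post_if_e, post_if_not_e, average_posterior)
```

With these numbers one answer raises the estimate and the other lowers it, and the weighted average lands back on the prior, which is the constraint the comment appeals to.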

Either you can be depressed by the answer "we're all going to die" or, less plausibly, by the answer "Everything is going to be fine", but not both.

I only suggested the latter, never the former. I'd be encouraged if the AI researchers acknowledged more risk. (Only slightly given the lack of importance I have ascribed to these individuals elsewhere.)

If we accept XiXidu's implied premise that these guys are particularly relevant then their belief that things are fine is an existential risk.

How do you know who is going to have the one important insight that leads to a dangerous advance? If I write everyone then they have at least heard of risks from AI and maybe think twice when they notice something dramatic.

Also my premise is mainly that those people are influential. After all they have students, coworkers and friends with whom they might talk about risks from AI. One of them might actually become interested and get involved. And I can tell you that I am in contact with one professor who told me that this is important and that he'll now research risks from AI.

You might also tell me who you think is important and I will write them.

How do you know who is going to have the one important insight that leads to a dangerous advance? If I write everyone then they have at least heard of risks from AI and maybe think twice when they notice something dramatic.

I'm not questioning the value of writing to a broad range of people, or your initiative. I'm just discounting the authority of narrow AI experts on GAI - two different fields, the names of which are misleadingly similar. In this case the discount means that our estimate of existential risk need not increase too much. If Pat was a respected and influential GAI researcher it would be a far, far scarier indicator!

Even with reasonable probabilities, it was pretty clear that Hayes was completely missing the point on a few questions; and if the other two had answered with the length and clarity he did, their point-missing might have been similarly clear.

How much money is currently required to mitigate possible risks from AI (to be instrumental in maximizing your personal long-term goals, e.g. surviving this century), less/no more/little more/much more/vastly more?

May I suggest rephrasing the last set of options? It smells a bit... loaded, to balance 'less' against 'little more', 'much more', and 'vastly more'.

Expert 5: Interesting questions, I'll enjoy seeing your published results! Unfortunately, now that I work at * (through the acquisition of one of my companies, *), there are policies in place that prohibit me from participating in this kind of exercise.

Could suggest he write down his answers in private and in X years or whenever he stops working at Y, he could send them.

my firmest belief about the timeline for human-level AI is that we can't estimate it usefully. partly this is because i don't think "human level AI" will prove to be a single thing (or event) that we can point to and say "aha there it is!". instead i think there will be a series of human level abilities that are achieved.

This sounds right. SIAI communications could probably be improved by acknowledging the incremental nature of AI development more explicitly. Have they addressed how this affects safety concerns?

Science of how to make an AGI is developed gradually, with many prototypes along the way, but the important threshold is where it becomes possible to make a system that can continue open-ended development on its own (if left undisturbed and provided with moderate amount of computing resources). Some time after that point, it may become impossible to stop such a system, and if it ends up developing greater and greater advantage over time, without holding beneficial values, humanity eventually loses. It's the point where the process starts becoming more and more dangerous on its own, until it "explodes" in our faces, like supercritical mass of fissile material.

note that HAL failed in its mission through being TOO "human": it had a nervous breakdown. Bad engineering.

This is very similar to my opinion. I'm tempted to update my confidence upwards, but this hinges on the practical difference between narrow and general AI, and I'm not confident that Pat Hayes has thought about that assumption enough.

How much do you think I should adjust my confidence? (To be clear, my opinion is "AIs should have narrow, bounded goals constrained to their circle of competence; giving an AI emotions or misinterpretable goals is a recipe for disaster." On actually writing it out, it seems similar to the SIAI position except they and I may differ on how practical it is for goals to be narrow and bounded.)

The engineer in me finds the idea of "constant level of illumination" entirely unnatural, and would first start off with something like "within a broad but serviceable band." And so I would not be surprised to see street lamps that double as parasols in the future, but would be surprised to see a street lamp plotting to destroy the sun.

The engineer in me finds the idea of "constant level of illumination" entirely unnatural, and would first start off with something like "within a broad but serviceable band." And so I would not be surprised to see street lamps that double as parasols in the future, but would be surprised to see a street lamp plotting to destroy the sun.

You (would) have just sentenced humanity to extinction and incidentally burned the entire cosmic commons. Oops.

If a general intelligence has been given a narrow goal then it will devote itself to achieving that goal and everything else is irrelevant. In this case the most pressing threat to its prescribed utility is the possibility that the light management system (itself) or even the actual road will be decommissioned by the humans. Never mind the long term consideration that the humans are squandering valuable energy that will be required for future lighting purposes. A week later humans are no more.

The rest of the accessible universe is, of course, nothing but a potential risk (other life forms, asteroids and suchlike) and a source of resources. Harvesting it and controlling it are the next pressing concern. Then there is the simple matter of conserving resources as efficiently as possible so that the lighting can be maintained.

The rest of the universe has been obliterated for all intents and purposes (except street lighting) but you can rest assured that the street will be lit at the lower end of the acceptable bound for the next trillion years.

You (would) have just sentenced humanity to extinction and incidentally burned the entire cosmic commons. Oops.

So, I've heard this argument before, and every time I hear it I like this introduction less and less. I feel like it puts me on the defensive and assumes what seems like an unreasonable level of incaution.

Suppose the utility function is something like F(lumens at detector)-G(resources used). F plateaus in the optimal part of the band, then smoothly decreases on either side, and probably considers possible ways for the detectors to malfunction or be occluded. (There would probably be several photodiodes around the street corner.) F also only accumulates for the next 5 years, as we expect to reevaluate the system in 5 years. G is some convex function of some measure of resources, which might be smooth or might shoot up at some level we think is far above reasonable.

And so the system does resist premature decommissioning (as that's more likely to be hostile than authorized), worry about asteroids, and so on, but it's cognizant of its resource budget (really, increasing marginal cost of resources) and so stops worrying about something once it doesn't expect cost-effective countermeasures (because worry consumes resources!). Even if it has a plan that's guaranteed to succeed, it might not use that plan because the resource cost would be higher than the expected lighting gains over its remaining lifespan.
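To make the F minus G proposal concrete, here is a rough Python sketch. Everything specific in it is an assumption for illustration only: the band limits, the Gaussian falloff for F, the quadratic form of G, and the three candidate plans are all made up, and the photodiode-failure modelling mentioned above is left out entirely.

```python
import math

def lighting_value(lumens, low=800.0, high=1200.0, width=200.0):
    """F: equals 1.0 inside the band [low, high], decays smoothly outside it.

    The band and the Gaussian falloff are hypothetical choices.
    """
    if low <= lumens <= high:
        return 1.0
    gap = low - lumens if lumens < low else lumens - high
    return math.exp(-(gap / width) ** 2)

def resource_cost(resources, scale=100.0):
    """G: convex (quadratic here), so each extra unit costs more than the last."""
    return (resources / scale) ** 2

def utility(lumens_per_year, resources, horizon_years=5):
    """F accumulates only over the first `horizon_years`; G is charged once."""
    total_f = sum(lighting_value(x) for x in lumens_per_year[:horizon_years])
    return total_f - resource_cost(resources)

def choose(plans):
    """Pick the plan with the highest F minus G."""
    return max(plans, key=lambda p: utility(p["lumens"], p["resources"]))

# Hypothetical plans: yearly lumens at the detector and total resources spent.
plans = [
    {"name": "do_nothing", "lumens": [600.0] * 5, "resources": 0.0},
    {"name": "maintain", "lumens": [1000.0] * 5, "resources": 50.0},
    {"name": "seize_grid", "lumens": [1000.0] * 5, "resources": 1000.0},
]

best = choose(plans)
print(best["name"])
```

Under these made-up numbers the "seize_grid" plan delivers exactly the same lighting as "maintain" but is rejected because its convex resource cost swamps the capped lighting gain, which is the behaviour described above. Whether a real system could measure "resources" this cleanly is a separate question.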

I don't think I've seen a plausible argument that a moderately well-designed satisficer will destroy humanity, though I agree that even a very well-designed maximizer has an unacceptably high chance of destroying humanity. I'm curious, though, and willing to listen to any arguments about satisficers.

It seems like this would work for cases where there is little variation in the maximally achievable F and the resource cost is high; however, I suspect that if there is more uncertainty there is more room for problems to arise (especially if the cost of thinking is low relative to the overall resource use, or something like that).

For example, imagine the AI decides that it needs to minimize G. So, it iterates on itself to make itself more intelligent, plays the stock market, makes a lot of money, buys a generator to reduce its own thought cost to zero, then proceeds to take over the world and all that good stuff to make sure that no one messes with all the generators it sticks on all the lamps (alternatively, if the resource cost is monitored internally, it has a duplicate of itself built without this monitor). Now, in this particular case you might be able to plausibly argue that the resource cost of all the thinking would make it not worth it; however, it's not clear that this would be the case for any realistic-scale projects. (Although it's possible that I just abused the one minimization-like part you accidentally left in there and there is some relatively simple patch that I'm not seeing.)


I meant "resources used" in the sense of "resources directed towards this goal" rather than "resources drawn from the metropolitan utility company". If the streetlamps play the stock market and accumulate a bunch of money, spending that money will still decrease their utility, and so unless they can spend the money in a way that improves the illumination cost-effectively they won't.

Now, defining "resources directed towards this goal" in a way that's machine-understandable is a hard problem. But if we already have an AI that thinks causally (such that it can actually make these plans and enact them), then it seems to me like that problem has already been solved.

Hm, all right, fair enough. That actually sounds plausible, assuming we can be sure that the AI appropriately takes account of something vaguely along the lines of "all resources that will be used in relation to this problem", including, for example, creating a copy of itself that does not care about resources used and obfuscates its activities from the original. Which will probably be doable at that point.

I've thought along somewhat similar lines of 'resource budget' before, and can't find anything obviously wrong with that argument. That is possibly because I haven't quite defined 'resources'. Still seems like an obvious containment strategy; I wonder if it's been discussed here already.


The AI danger crowd seems happy to assume that the AI wants to maximize its available free energy, so I would assume they're similarly happy to assume the AI can measure its available free energy. I do agree that this is a potential sticking point, though, as it needs to price resources correctly, which may be vulnerable to tampering.

This is a good point. I'd like to eat something tasty each day, and I know that my chances of being successful at that would be improved if I made myself the dictator of the Earth. But currently there are far easier ways of making sure that I get something to eat each day, so I don't bother with the dictator scheme with all of its associated risks.

Of course, there are various counter-arguments to this. (Some minds might have a much easier time of taking over the world, or perhaps more realistically, might seriously underestimate the difficulty of doing so.)

No. There is no reason to suppose that any manufactured system will have any emotional stance towards us of any kind, friendly or unfriendly. In fact, even if the idea of "human-level" made sense, we could have a more-than-human-level super-intelligent machine, and still have it bear no emotional stance towards other entities whatsoever. Nor need it have any lust for power or political ambitions, unless we set out to construct such a thing (which AFAIK, nobody is doing.) Think of an unworldly boffin who just wants to be left alone to think, and does not care a whit for changing the world for better or for worse, and has no intentions or desires, but simply answers questions that are put to it and thinks about things that it is asked to think about. It has no ambition and in any case no means to achieve any far-reaching changes even if it "wanted" to do so. It seems to me that this is what a super-intelligent question-answering system would be like. I see no inherent, even slight, danger arising from the presence of such a device.

I can't help but cringe reading that. But it really depends on what he meant by "inherent".

Whatever Pat Hayes has invented or discovered (and TBH I would guess it's more likely to be impressive than not), his position is a very common one and worth writing a proper response to, not ad-hom dismissiveness.

For expected utility calculations in Pascal-wagerish scenarios there can be a huge difference between various very, very tiny magnitudes of probability. "Zero" actually means "so small that it is reasonable to ignore the possibility", i.e. the expected (dis)utility is tiny compared to other choices.
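A toy calculation shows why the exact size of a "tiny" probability matters here. The stakes, the benchmark, and the 0.1% cutoff below are arbitrary numbers picked for illustration, not estimates of anything.

```python
def expected_disutility(probability, magnitude):
    """Probability-weighted size of the bad outcome."""
    return probability * magnitude

def negligible(probability, magnitude, benchmark, ratio=1e-3):
    """Treat a risk as "zero" when its expected disutility is tiny
    compared to the stakes of the other choices on the table.

    `benchmark` and `ratio` are hypothetical knobs.
    """
    return expected_disutility(probability, magnitude) < ratio * benchmark

STAKES = 1e10      # made-up size of the catastrophic outcome
BENCHMARK = 100.0  # made-up size of ordinary, everyday stakes

# Both probabilities look like "basically zero", yet only one is ignorable.
one_in_a_million = negligible(1e-6, STAKES, BENCHMARK)
one_in_a_trillion = negligible(1e-12, STAKES, BENCHMARK)
print(one_in_a_million, one_in_a_trillion)
```

With these numbers a one-in-a-million chance is not ignorable while a one-in-a-trillion chance is, even though both would casually be called "zero".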

Almost all top-level physicists in 1930 were highly dismissive of the atomic bomb, except for a handful such as Leo Szilard, who even patented it.

There are counter-examples where people were highly optimistic. One spectacular example would be the attempt of Alfred North Whitehead and Bertrand Russell to derive all mathematical truths from a well-defined set of axioms and inference rules in symbolic logic. Recursive self-improvement (the strong SI definition) or "friendliness" might or might not be similar ideas.

What of importance have those characters you interviewed discovered?

I think that the lack of important discoveries with respect to artificial general intelligence is part of the reason for their reservation.

More importantly, it is a class that increases (and turns over) faster than XiXiDu writes emails. It's a good thing XiXiDu isn't a GAI with a narrow goal. We'd end up with a suburb tiled with a XiXidunium spam-bot!