Pascal’s Mugging: Tiny Probabilities of Vast Utilities

The most common formalizations of Occam’s Razor, Solomonoff induction and Minimum Description Length, measure the program size of a computation used in a hypothesis, but don’t measure the running time or space requirements of the computation. What if this makes a mind vulnerable to finite forms of Pascal’s Wager? A compactly specified wager can grow in size much faster than it grows in complexity. The utility of a Turing machine can grow much faster than its prior probability shrinks.

In other words: 3^^^3 describes an exponential tower of threes 7625597484987 layers tall. Since this number can be computed by a simple Turing machine, it contains very little information and requires a very short message to describe. This, even though writing out 3^^^3 in base 10 would require enormously more writing material than there are atoms in the known universe (a paltry 10^80).
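To make the notation concrete, here is a minimal sketch of Knuth’s up-arrow operation (the function name is my own; only tiny inputs are feasible, since even 3^^^3 itself is astronomically too large to evaluate):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow: a followed by n arrows, then b.

    n=1 is ordinary exponentiation; each extra arrow iterates
    the previous operation b-1 times.
    """
    if n == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up_arrow(a, n - 1, result)
    return result

# 3^^3 = 3^(3^3) = 3^27 = 7625597484987: exactly the height of
# the tower of threes that 3^^^3 denotes.
print(up_arrow(3, 2, 3))  # 7625597484987
```

A dozen lines of code suffice to specify the number; writing out its digits would not fit in the universe.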

Now suppose someone comes to me and says, “Give me five dollars, or I’ll use my magic powers from outside the Matrix to run a Turing machine that simulates and kills 3^^^^3 people.”

Call this Pascal’s Mugging.

“Magic powers from outside the Matrix” are easier said than done—we have to suppose that our world is a computing simulation run from within an environment that can afford simulation of arbitrarily large finite Turing machines, and that the would-be wizard has been spliced into our own Turing tape and is in continuing communication with an outside operator, etc.

Thus the Kolmogorov complexity of “magic powers from outside the Matrix” is larger than the mere English words would indicate. Therefore the Solomonoff-inducted probability, two to the negative Kolmogorov complexity, is exponentially tinier than one might naively think.

But, small as this probability is, it isn’t anywhere near as small as 3^^^^3 is large. If you take a decimal point, followed by a number of zeros equal to the length of the Bible, followed by a 1, and multiply this unimaginably tiny fraction by 3^^^^3, the result is pretty much 3^^^^3.
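The arithmetic can be checked on a log scale without ever materializing the numbers. A sketch, using 3^^4 as a stand-in (even its logarithm is the most we can compute; 3^^^3 and 3^^^^3 are incomparably larger) and a rough ~3-million-character figure for the length of the Bible, which is my own assumption:

```python
import math

# Work in log10, since the numbers themselves don't fit in memory.
# log10(3^^4) = log10(3^(3^27)) = 3^27 * log10(3) ~ 3.6e12,
# i.e. 3^^4 already has trillions of digits.
log10_tower_4 = 3**27 * math.log10(3)

# "Decimal point, Bible-length run of zeros, then a 1":
# roughly 10^(-3,000,000), assuming ~3 million characters.
log10_prob = -3_000_000

# Their product in log10: the penalty barely dents the exponent.
log10_product = log10_tower_4 + log10_prob
print(log10_product > 0.999 * log10_tower_4)  # True
```

Multiplying by the tiny probability shaves three million off an exponent of trillions: the result is, for all purposes, unchanged.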

Most people, I think, envision an “infinite” God that is nowhere near as large as 3^^^^3. “Infinity” is reassuringly featureless and blank. “Eternal life in Heaven” is nowhere near as intimidating as the thought of spending 3^^^^3 years on one of those fluffy clouds. The notion that the diversity of life on Earth springs from God’s infinite creativity sounds more plausible than the notion that life on Earth was created by a superintelligence 3^^^^3 bits large. Similarly for envisioning an “infinite” God interested in whether women wear men’s clothing, versus a superintelligence of 3^^^^3 bits, etc.

The original version of Pascal’s Wager is easily dealt with by the gigantic multiplicity of possible gods, an Allah for every Christ and a Zeus for every Allah, including the “Professor God” who places only atheists in Heaven. And since all the expected utilities here are allegedly “infinite”, it’s easy enough to argue that they cancel out. Infinities, being featureless and blank, are all the same size.

But suppose I built an AI which worked by some bounded analogue of Solomonoff induction—an AI sufficiently Bayesian to insist on calculating complexities and assessing probabilities, rather than just waving them off as “large” or “small”.

If the probabilities of various scenarios considered did not exactly cancel out, the AI’s action in the case of Pascal’s Mugging would be overwhelmingly dominated by whatever tiny differentials existed in the various tiny probabilities under which 3^^^^3 units of expected utility were actually at stake.

You or I would probably wave off the whole matter with a laugh, planning according to the dominant mainline probability: Pascal’s Mugger is just a philosopher out for a fast buck.

But a silicon chip does not look over the code fed to it, assess it for reasonableness, and correct it if not. An AI is not given its code like a human servant given instructions. An AI is its code. What if a philosopher tries Pascal’s Mugging on the AI for a joke, and the tiny probabilities of 3^^^^3 lives being at stake override everything else in the AI’s calculations? What is the mere Earth at stake, compared to a tiny probability of 3^^^^3 lives?

How do I know to be worried by this line of reasoning? How do I know to rationalize reasons a Bayesian shouldn’t work that way? A mind that worked strictly by Solomonoff induction would not know to rationalize reasons that Pascal’s Mugging mattered less than Earth’s existence. It would simply go by whatever answer Solomonoff induction obtained.

It would seem, then, that I’ve implicitly declared my existence as a mind that does not work by the logic of Solomonoff, at least not the way I’ve described it. What am I comparing Solomonoff’s answer to, to determine whether Solomonoff induction got it “right” or “wrong”?

Why do I think it’s unreasonable to focus my entire attention on the magic-bearing possible worlds, faced with a Pascal’s Mugging? Do I have an instinct to resist exploitation by arguments “anyone could make”? Am I unsatisfied by any visualization in which the dominant mainline probability leads to a loss? Do I drop sufficiently small probabilities from consideration entirely? Would an AI that lacks these instincts be exploitable by Pascal’s Mugging?

Is it me who’s wrong? Should I worry more about the possibility of some Unseen Magical Prankster of very tiny probability taking this post literally, than about the fate of the human species in the “mainline” probabilities?

It doesn’t feel to me like 3^^^^3 lives are really at stake, even at very tiny probability. I’d sooner question my grasp of “rationality” than give five dollars to a Pascal’s Mugger because I thought it was “rational”.

Should we penalize computations with large space and time requirements? This is a hack that solves the problem, but is it true? Are computationally costly explanations less likely? Should I think the universe is probably a coarse-grained simulation of my mind rather than real quantum physics, because a coarse-grained human mind is exponentially cheaper than real quantum physics? Should I think the galaxies are tiny lights on a painted backdrop, because that Turing machine would require less space to compute?
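Existing formalizations in this spirit do exist: Levin’s Kt complexity and Schmidhuber’s speed prior charge a hypothesis for the logarithm of its running time as well as for its length. A toy comparison with made-up program sizes and runtimes (the functions below are illustrative, not faithful implementations of either prior):

```python
import math

def length_prior(k_bits):
    # Solomonoff-style weight 2^-K: program length only.
    return 2.0 ** -k_bits

def speed_prior(k_bits, runtime_steps):
    # Runtime-penalized weight in the spirit of Levin's Kt:
    # 2^-(K + log2 t). Illustrative form only.
    return 2.0 ** -(k_bits + math.log2(runtime_steps))

# Made-up numbers: program A is shorter but astronomically slow
# (say, it must simulate a vast population); B is longer but fast.
a_len, a_spd = length_prior(50), speed_prior(50, 2.0 ** 200)
b_len, b_spd = length_prior(200), speed_prior(200, 2.0 ** 10)

print(a_len > b_len)  # length-only prior favors the short program
print(a_spd < b_spd)  # the runtime penalty flips the ranking
```

Under the length-only prior the short, slow program dominates; under the runtime-penalized prior it is crushed, which is exactly the behavior the “hack” asks for, and exactly what raises the painted-backdrop worry above.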

Given that, in general, a Turing machine can increase in utility vastly faster than it increases in complexity, how should an Occam-abiding mind avoid being dominated by tiny probabilities of vast utilities?

If I could formalize whichever internal criterion was telling me I didn’t want this to happen, I might have an answer.

I talked over a variant of this problem with Nick Hay, Peter de Blanc, and Marcello Herreshoff in summer of 2006. I don’t feel I have a satisfactory resolution as yet, so I’m throwing it open to any analytic philosophers who might happen to read Overcoming Bias.

Why would not giving him $5 make it more likely that people would die, as opposed to less likely? The two would seem to cancel out. It’s the same old “what if we are living in a simulation?” argument: it is, at least, possible that me hitting the sequence of letters “QWERTYUIOP” leads to a near-infinity of death and suffering in the “real world”, due to AGI overlords with wacky programming. Yet I do not refrain from hitting those letters, because there’s no entanglement which drives the probabilities in that direction as opposed to some other random direction; my actions do not alter the expected future state of the universe. You could just as easily wind up saving lives as killing people.

The mugger claims not to be a ‘person’ in the conventional sense, but rather an entity with outside-Matrix powers. If this statement is true, then generalized observations about the reference class of ‘people’ cannot necessarily be considered applicable.

Conversely, if it is false, then this is not a randomly-selected person, but rather someone who has started off the conversation with an outrageous profit-motivated lie, and as such cannot be trusted.

I am not convinced that, even among humans speaking to other humans, truth-telling can be assumed when there is such a blatantly obvious incentive to lie.

I mean, say there actually is someone who can destroy vast but currently-unobservable populations with less effort than it would take them to earn $5 with conventional economic activity, and the ethical calculus works out such that you’d be better served to pay them $5 than let it happen. At that point, aren’t they better served to exaggerate their destructive capacity by an order of magnitude or two, and ask you for $6? Or $10?

Once the number the mugger quotes exceeds your ability to independently confirm, or even properly imagine, the number itself becomes irrelevant. It’s either a display of incomprehensibly overwhelming force, to which you must submit utterly or be destroyed, or a bluff you should ignore.

There is no blatantly obvious reason to want to torture the people only if you do give him money.

At that point, aren’t they better served to exaggerate their destructive capacity by an order of magnitude or two, and ask you for $6? Or $10?

So, you’re saying that the problem is that, if they really were going to kill 3^^^3 people, they’d lie? Why? 3^^^3 isn’t just enough to get $5. It’s enough that the expected seriousness of the threat is unimaginably large.

Look at it this way: if they’re going to lie, there’s no reason to exaggerate their destructive capacity by an order of magnitude when they can just make up a number. If they choose to make up a number, 3^^^3 is plenty high. As such, if it really is 3^^^3, they might as well just tell the truth. If there’s any chance that they’re not lying given that they really can kill 3^^^3 people, their threat is valid. It’s one thing to be 99.9% sure they’re lying, but here, a 1 − 1/sqrt(3^^^3) certainty that they’re lying still gives more than enough doubt for an unimaginably large threat.

It’s either a display of incomprehensibly overwhelming force, to which you must submit utterly or be destroyed, or a bluff you should ignore.

You’re not psychic. You don’t know which it is. In this case, the risk of the former is enough to overwhelm the larger probability of the latter.

Let’s say you’re a sociopath, that is, the only factors in your utility function are your own personal security and happiness. Two unrelated people approach you simultaneously, one carrying a homemade single-shot small-caliber pistol (a ‘zip gun’) and the other apparently unarmed. Both of them, separately, demand $10 in exchange for not killing you immediately. You’ve got a $20 bill in your wallet; the unarmed mugger, upon learning this, obligingly offers to make change. While he’s thus distracted, you propose to the mugger with the zip gun that he shoot the unarmed mugger, and that the two of you then split the proceeds. The mugger with the zip gun refuses, explaining that the unarmed mugger claims to be close personal friends with a professional sniper, who is most likely observing this situation from a few hundred yards away through a telescopic sight and would retaliate against anyone who hurt her friend the mugger. The mugger with the zip gun has never actually met the sniper or directly observed her handiwork, but is sufficiently deterred by rumor alone.

If you don’t pay the zip-gun mugger, you’ll definitely get shot at, but only once, and with good chances of a miss or nonfatal injury. If you don’t pay the unarmed mugger, and the sniper is real, you will almost certainly die before you can determine her position or get behind sufficiently hard cover. If you pay them both, you will have to walk home through a bad part of town at night instead of taking the quicker-and-safer bus, which apart from the inconvenience might result in you being mugged a third time.

How would you respond to that?

I don’t need to be psychic. I just do the math. Taking any sort of infinitesimally-unlikely threat so seriously that it dominates my decision-making means anyone can yank my chain just by making a few unfounded assertions involving big enough numbers, and then once word gets around, the world will no longer contain acceptable outcomes.

Not so much an answer to the problem, but a clue to the reason WHY we intuitively, as humans, know to respond in a way which seems un-mathematical.

It seems like a game theory problem to me. Here, we’re calling the opponent’s bluff. If we make the decision that SEEMINGLY MAXIMIZES OUR UTILITY, according to game theory we’re set up for a world of hurt in terms of indefinite situations where we can be taken advantage of. Game theory already contains lots of situations where reasons exist to take action that seemingly does not maximize your own utility.

In your example, only you die. In Pascal’s mugging, it’s unimaginably worse.

Do you accept that, in the circumstance you gave, you are more likely to be shot by a sniper if you only pay one mugger? Not significantly more likely, but still more likely? If so, that’s analogous to accepting that Pascal’s mugger will be more likely to make good on his threat if you don’t pay.

In my example, the person making the decision was specified to be a sociopath, for whom there is no conceivable worse outcome than the total loss of personal identity and agency associated with death.

The two muggers are indifferent to each other’s success. You could pay off the unarmed mugger to eliminate the risk of being sniped (by that particular mugger’s friend, at least, if she exists; there may well be other snipers elsewhere in town with unrelated agendas, about whom you have even less information) and accept the risk of being shot with the zip gun, in order to afford the quicker, safer bus ride home. In that case you would only be paying one mugger, and still have the lowest possible sniper-related risk.

The three possible expenses were meant as metaphors for existential risk mitigation (imaginary sniper), infrastructure development (bus), and military/security development (zip gun), the latter two forming the classic guns-or-butter economic dilemma. Historically speaking, societies that put too much emphasis, too many resources, toward preventing low-probability high-impact disasters, such as divine wrath, ended up succumbing to comparatively banal things like famine, or pillaging by shorter-sighted neighbors. What use is a mathematical model of utility that would steer us into those same mistakes?

Is your problem that we’d have to keep the five dollars in case of another mugger? I’d hardly consider the idea of steering our life around Pascal’s mugging to be disagreeing with it. For what it’s worth, if you look for hypothetical Pascal’s muggings, expected utility doesn’t converge and decision theory breaks down.

Odd, I’ve been reading moral paradoxes for many years and my brain never crashed once, nor have I turned evil. I’ve been confused but never catastrophically so (though I have to admit my younger self came close). My algorithm must be “beyond clever”.

That’s a remarkable level of resilience for a brain design which is, speaking professionally, a damn ugly mess. If I can’t aspire to do at least that well, I may as well hang up my shingle and move in with the ducks.

The modern human nervous system is the result of upwards of a hundred thousand years of brutal field-testing. The basic components, and even whole submodules, can be traced back even further. A certain amount of resiliency is to be expected. If you want to start from scratch and aspire to the same or higher standards of performance, it might be sensible to be prepared to invest the same amount of time and capital that the BIG did.

That you have not yet been crippled by a moral paradox or other standard rhetorical trick is comparable to saying that a server remains secure after a child spent an afternoon poking around with it and trying out lists of default passwords: a good sign, certainly, and a test many would fail, but not in itself proof of perfection.

Indeed, on a list of things we can expect evolved brains to be, ROBUST is very high on the list. (“Rational” is actually rather hard to come by. To some degree, rationality improves fitness. But often its cost outweighs its benefit, hence the sea slug.)

Additionally, people throw away problems if they can’t solve them, or if getting the specifics of the answer is beyond their limits. A badly designed AI system wouldn’t have that option and so would be paralyzed by calculation.

I agree with the commenter above who said the best thing to stop anything like this from happening is an AI system with checks and balances which automatically throws out certain problems. In the abstract, that might conceivably be bad. In the real world it probably won’t be. “Probably” isn’t very inspiring or logically compelling, but I think it’s the best that we can do.

Unless we design the first AI system with a complex goal system oriented around fixing itself, one that basically boils down to “do your best to find and solve any problems or contradictions within your system, ask for our help whenever you are unsure of an answer, then design a computer which can do the same task better than you, etc., then have the final computer begin the actual work of an AI”. The thought comes from Douglas Adams’ Hitchhiker books; I forget the names of the computers, but it doesn’t matter.

To anyone who says it’s impossible or unfeasible to implement something like this: note that having one biased computer attempt to correct its own biases and create a less biased computer is in all relevant ways equivalent to having one biased human attempt to correct its own biases and create a less biased computer.

Tom and Andrew, it seems very implausible that someone saying “I will kill 3^^^^3 people unless X” is literally zero Bayesian evidence that they will kill 3^^^^3 people unless X. Though I guess it could plausibly be weak enough to take much of the force out of the problem.

Andrew, if we’re in a simulation, the world containing the simulation could be able to support 3^^^^3 people. If you knew (magically) that it couldn’t, you could substitute something on the order of 10^50, which is vastly less forceful but may still lead to the same problem.

Andrew and Steve, you could replace “kill 3^^^^3 people” with “create 3^^^^3 units of disutility according to your utility function”. (I respectfully suggest that we all start using this form of the problem.)

Michael Vassar has suggested that we should consider any number of identical lives to have the same utility as one life. That could be a solution, as it’s impossible to create 3^^^^3 distinct humans. But this also is irrelevant to the create-3^^^^3-disutility-units form.

IIRC, Peter de Blanc told me that any consistent utility function must have an upper bound (meaning that we must discount lives like Steve suggests). The problem disappears if your upper bound is low enough. Hopefully any realistic utility function has such a low upper bound, but it’d still be a good idea to solve the general problem.
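A toy illustration of why an upper bound dissolves the mugging. All numbers here are my own illustrative assumptions: the stakes are passed in as a base-10 logarithm (so quantities on the scale of 3^^^^3 can be represented symbolically), the cap is arbitrary, and the probability is made up:

```python
def bounded_utility(log10_lives, cap=1e6):
    """Hypothetical bounded utility: saturates at `cap` no matter
    how large the claimed stakes are."""
    raw = 10.0 ** min(log10_lives, 300)  # clamp to avoid float overflow
    return min(raw, cap)

# Even granting the mugger a 1e-10 chance of telling the truth about
# stakes whose log10 is astronomical, the capped expected utility is
# far below five dollars' worth, here taken as 5 utility units.
p_threat = 1e-10
expected = p_threat * bounded_utility(log10_lives=3.6e12)
print(expected < 5.0)  # True
```

With the cap in place, the mugger’s claimed number stops mattering entirely once it exceeds the bound; only the probability term does any work.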

I see a similarity to the police chief example. Adopting a policy of paying attention to any Pascalian muggings would encourage others to manipulate you using them. At first it doesn’t seem like this would have nearly enough disutility to justify ignoring muggings, but it might when you consider that it would interfere with responding to any real threat (unlikely as it is) of 3^^^^3 deaths.

If your utility function assigns values to outcomes that differ by a factor of X, then you are vulnerable to becoming a fanatic who banks on scenarios that only occur with probability 1/X. As simple as that.

If you think that banking on scenarios that only occur with probability 1/X is silly, then you have implicitly revealed that your utility function only assigns values in the range [1,Y], where Y&lt;X, and where 1 is the lowest utility you assign.

If you think that banking on scenarios that only occur with probability 1/X is silly, then you have implicitly revealed that your utility function only assigns values in the range [1,Y], where Y&lt;X, and where 1 is the lowest utility you assign.

… or your judgments of silliness are out of line with your utility function.

Michael Vassar has suggested that we should consider any number of identical lives to have the same utility as one life. That could be a solution, as it’s impossible to create 3^^^^3 distinct humans. But this also is irrelevant to the create-3^^^^3-disutility-units form.

What if we required that the utility function grow no faster than the Kolmogorov complexity of the scenario? This seems like a suitable generalization of Vassar’s proposal.
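One way to see why this proposal tames the problem: if a scenario whose shortest description is K bits receives prior 2^-K and utility at most proportional to K, then total expected utility summed over all scenarios converges, so no single compactly described threat can dominate. A quick numerical check of the relevant series:

```python
# Prior 2^-K for a K-bit scenario, utility at most c*K (take c = 1):
# total expected utility is the convergent series
#   sum_{K>=1} K * 2^-K = 2.
partial = sum(k * 2.0 ** -k for k in range(1, 200))
print(round(partial, 9))  # 2.0
```

Contrast this with utilities like 3^^^^3 that grow uncomputably faster than K, for which the corresponding series diverges.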

It seems to me that the cancellation is an artifact of the particular example, and that it would be easy to come up with an example in which the cancellation does not occur. For example, maybe you have previous experience with the mugger. He has mugged you before about minor things, and sometimes you have paid him and sometimes not. In all cases he has been true to his word. This would seem to tip the probabilities at least slightly in favor of him being truthful about his current much larger threat.

Even in that case I would assign enormously higher probability to the hypothesis that my deadbeat pal has caught some sort of brain disease that results in compulsive lying, than that such a person has somehow acquired reality-breaking powers but still has nothing better to do than hit me up for spare change.

I don’t know—if he did actually have reality-breaking powers, he would likely be tempted to put them to more effective use. If he would in fact be less likely to be making the statement were it true, then it is evidence against, not evidence for, the truth of his statement.

I think you’re assuming that to give in to the mugging is the wrong answer in a one-shot game for a being that values all humans in existence equally, because it feels wrong to you, a being with a moral compass evolved in iterated multi-generational games.

Consider these possibilities, any one of which would create challenges for your reasoning:

1. Giving in is the right answer in a one-shot game, but the wrong answer in an iterated game. If you give in to the mugging, the outsider will keep mugging you and other rationalists until you’re all broke, leaving the universe’s future in the hands of “Marxians” and post-modernists.

2. Giving in is the right answer for a rational AI God, but evolved beings (under the Darwinian definition of “evolved”) can’t value all members of their species equally. They must value kin more than strangers. You would need a theory to explain why any being that evolved due to resource competition wouldn’t consider killing a large number of very distantly-related members of its species to be a good thing.

3. You should interpret the conflict between your intuition, and your desire for a rational God, not as showing that you’re reasoning badly because you’re evolved, but that you’re reasoning badly by desiring a rational God bound by a static utility function. This is complicated, so I’m gonna need more than one paragraph:

Intuitively, my argument boils down to applying the logic behind free markets, freedom of speech, and especially evolution, to the question of how to construct God’s utility function. This will be vague, but I think you can fill in the blanks.

Free-market economic theory developed only after millennia during which everyone believed that top-down control was the best way of allocating resources. Freedom of speech developed only after millennia during which everyone believed that it was rational for everyone to try to suppress any speech they disagreed with. Political liberalism developed only after millennia during which everybody believed that the best way to reform society was to figure out what the best society would be like, then force that on everyone. Evolution was conceived of—well, originally about 2500 years ago, probably by Democritus, but it became popular only after millennia during which everyone believed that life could be created only by design.

All of these developments came from empiricists. Empiricism is one of the two opposing philosophical traditions of Western thought. It originated, as far as we know, with Democritus (about whom Plato reportedly said that he wished all his works to be burned—which they eventually were). It went through the Skeptics, the Stoics, Lucretius, nominalism, the use of numeric measurements (re-introduced to the West circa 1300), the Renaissance and Enlightenment, and eventually (with the addition of evolution, probability, statistics, and operationalized terms) created modern science.

A key principle of empiricism, on which John Stuart Mill explicitly based his defense of free speech, is that we can never be certain. If you read about the Skeptics and Stoics today, you’ll read that they “believed nothing”, but that was because, to their opponents, “believe” meant “know something with 100% certainty”.

(The most famous Skeptic, Sextus Empiricus, was called “Empiricus” because he was of the empirical school of medicine, which taught learning from experience. Its opponent was the rational school of medicine, which used logic to interpret the dictums of the ancient authorities.)

The opposing philosophical tradition, founded by Plato, is rationalism. “Rational” does not mean “good thinking”. It has a very specific meaning, and it is not a good way of thinking. It means reasoning about the physical world the same way Euclid constructed geometric proofs. No measurements, no irrational numbers, no observation of the world, no operationalized nominalist definitions, no calculus or differential equations, no testing of hypotheses—just armchair a priori logic about universal categories, based on a set of unquestionable axioms, done in your favorite human language. Rationalism is the opposite of science, which is empirical. The pretense that “rational” means “right reasoning” is the greatest lie foisted on humanity by philosophers.

Dualist rationalism is inherently religious, as it relies on some concept of “spirit”, such as Plato’s Forms, Augustine’s God, Hegel’s World Spirit, or an almighty programmer converting sense data into LISP symbols, to connect the inexact, ambiguous, changeable things of this world to the precise, unambiguous, unchanging, and usually unquantified terms in its logic.

(Monist rationalists, like Buddha, Parmenides, and post-modernists, believe sense data can’t be divided unambiguously into categories, and thus we may not use categories. Modern empiricists categorize sense data using statistics.)

Rationalists support strict, rigid, top-down planning and control. This includes their opposition to free markets, free speech, gradual reform, and optimization and evolution in general. This is because rationalists believe they can prove things about the real world, and hence their conclusions are reliable, and they don’t need to mess around with slow, gradual improvements or with testing. (Of course each rationalist believes that every other rationalist was wrong, and should probably be burned at the stake.)

They oppose all randomness and disorder, because it makes strict top-down control difficult, and threatens to introduce change, which can only be bad once you’ve found the truth.

They have to classify every physical thing in the world into a discrete, structureless, atomic category, for use in their logic. That has led inevitably to theories which require all humans to ultimately have, at reflective equilibrium, the same values—as Plato, Augustine, Marx, and CEV all do.

You have, I think, picked up some of these bad inclinations from rationalism. When you say you want to find the “right” set of values (via CEV) and encode them into an AI God, that’s exactly like the rationalists who spent their lives trying to find the “right” way to live, and then suppress all other thoughts and enforce that “right way” on everyone, for all time. Whereas an empiricist would never claim to have found final truth, and would always leave room for new understandings and new developments.

Your objection to randomness is also typically rationalist. Randomness enables you to sample without bias. A rationalist believes he can achieve complete lack of bias; an empiricist believes that neither complete lack of bias nor complete randomness can be achieved, but that for a given amount of effort, you might achieve lower bias by working on your random number generator and using it to sample, than by hacking away at your biases.

So I don’t think we should build an FAI God who has a static set of values. We should build, if anything, an AI referee, who tries only to keep conditions in the universe that will enable evolution to keep on producing behaviors, concepts, and creatures of greater and greater complexity. Randomness must not be eliminated, for without randomness we can have no true exploration, and must be ruled forever by the beliefs and biases of the past.

Your over­all point is right and im­por­tant but most of your spe­cific his­tor­i­cal claims here are false—more myth­i­cal than real.

Free-mar­ket eco­nomic the­ory de­vel­oped only af­ter mil­le­nia dur­ing which ev­ery­one be­lieved that top-down con­trol was the best way of al­lo­cat­ing re­sources.

Free mar­ket eco­nomic the­ory was de­vel­oped dur­ing a pe­riod of rapid cen­tral­iza­tion of power, be­fore which it was com­mon sense that most re­source al­lo­ca­tion had to be done at the lo­cal level, let­ting peas­ants mostly alone to farm their own plots. To find a prior epoch of de­liber­ate cen­tral re­source man­age­ment at scale you have to go back to the Bronze Age, with mas­sive ir­ri­ga­tion pro­jects and other ur­ban ameni­ties built via palace economies, and even then there wasn’t re­ally an ide­ol­ogy of cen­tral­iza­tion. A few Greek city-states like Sparta had tightly reg­u­lated mores for the elites, but the fa­mously op­pressed Helots were still prob­a­bly mostly left alone. In Rus­sia, Com­mu­nism was a mas­sive cen­tral­iz­ing force—which im­plies that peas­ants had mostly been left alone be­fore­hand. Cen­tral­iza­tion is about states try­ing to be­come more pow­er­ful (which is why Smith called his book The Wealth of Na­tions, pitch­ing his mes­sage to the peo­ple who needed to be per­suaded.) Toc­queville’s The Old Regime de­scribes cen­tral­iza­tion in France be­fore and af­ter the Revolu­tion. War and Peace has a good em­piri­cal treat­ment of the mod­ern­iz­ing/​cen­tral­iz­ing force vs the old-fash­ioned em­piri­cal im­pulse in Rus­sia. “Free­dom” is not always de­cen­tral­iz­ing, though, as the book makes clear.

Freedom of speech developed only after millennia during which everyone believed that it was rational for everyone to try to suppress any speech they disagreed with.

There was some­thing much like this in both the Athe­nian (and prob­a­bly broader Greek) world (the demo­cratic pre­rog­a­tive to pub­li­cly de­bate things), and the Is­raelite world (prophets nor­ma­tively had some­thing close to im­mu­nity from pros­e­cu­tion for speech, and there were no qual­ifi­ca­tions needed to proph­esy). In both cases there were limits, but there are limits in our world too. The ide­ol­ogy of free­dom of speech is new, but your char­ac­ter­i­za­tion of the al­ter­na­tive is ten­den­tious.

Political liberalism developed only after millennia during which everybody believed that the best way to reform society was to figure out what the best society would be like, then force that on everyone.

Poli­ti­cal liber­al­ism is not re­ally an ex­cep­tion to this!

Evolution was conceived of—well, originally about 2500 years ago, probably by Democritus, but it became popular only after millennia during which everyone believed that life could be created only by design.

It’s re­ally un­clear what past gen­er­a­tions meant by God, but this one is prob­a­bly right.

This is neutralized by the possibility of Pascal’s Agent Who Just Likes Messing With You, who has arranged things such that any time an agent A is motivated by infinitesimal probability P1 of vast utility-shift dU, a utility shift of (P2 × P1 × −dU) is created, where P2 is A’s probability of PAWJLMWY’s existence.

I’m hereby anti-mug­ging you all. If any of you give in to a Pas­cal’s Mug­ging sce­nario, I’ll do some­thing much worse than what­ever the mug­ger threat­ened. Con­sider your­self warned!

This doesn’t work (even in the sense that this kind of mugging works) unless you instantiate the ‘much’ with a ridiculous factor, preferably involving up-arrows or Busy Beaver numbers. The credibility of your anti-mugging is quite likely to be significantly lower than that of a specific, personal mugging, because you’ve made it generic, you sound half-hearted, and you would somehow have to be aware of the mugging events that take place, when the reader doesn’t have any good reason to expect you to be. This difference in credibility will usually dwarf something merely ‘much’ worse, as ‘much’ is commonly used. You need to throw in an extra level of stupidly large numbers in place of ‘much’ for Pascal’s-mugging logic to apply to your anti-mugging.

My “much” is too big for puny Con­way chained-ar­row no­ta­tion on this world’s pa­per sup­ply. And the threat isn’t generic, it’s uni­ver­sal. Per­haps I “would have to even be aware of the mug­ging events”, but I have my ways, and you can’t af­ford to take the risk I might find out. I’m not be­ing half-hearted—I’m be­ing heartless. Your failure of imag­i­na­tion in com­pre­hend­ing the much­ness may be your un­do­ing.

My “much” is too big for puny Con­way chained-ar­row no­ta­tion. And the threat isn’t generic, it’s uni­ver­sal. Per­haps I “would have to even be aware of the mug­ging events”, but I have my ways, and you can’t af­ford to take the risk I might find out. I’m not be­ing half-hearted—I’m be­ing heartless.

The first sentence is all that was required. (Although for reference, note that the already-mentioned Busy Beaver function trumps Conway chained arrows, so you could perhaps aim your hyperbole more effectively.)

Your failure of imag­i­na­tion in com­pre­hend­ing the much­ness may be your un­do­ing.

That seems unlikely; your mugging threat was directed at those who give in to Pascal’s muggings, which I already don’t take particularly seriously. In fact I am not especially predisposed to give in to conventional threats, even though there are situations in which I do concede. In this case I was merely offering a suggestion on how to repair your mugging so that it would actually work on hypothesized individuals vulnerable to such things.

3. Even if you don’t ac­cept 1 and 2 above, there’s no rea­son to ex­pect that the per­son is tel­ling the truth. He might kill the peo­ple even if you give him the $5, or con­versely he might not kill them even if you don’t give him the $5.

But if a Bayesian AI ac­tu­ally calcu­lates these prob­a­bil­ities by as­sess­ing their Kol­mogorov com­plex­ity—or any other tech­nique you like, for that mat­ter—with­out de­siring that they come out ex­actly equal, can you rely on them com­ing out ex­actly equal? If not, an ex­pected util­ity differ­en­tial of 2 to the nega­tive googol­plex times 3^^^^3 still equals 3^^^^3, so what­ever tiny prob­a­bil­ity differ­ences ex­ist will dom­i­nate all calcu­la­tions based on what we think of as the “real world” (the main­line of prob­a­bil­ity with no wiz­ards).
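The structure of that dominance argument can be sketched with toy numbers (a sketch only: 3^^^^3 and 2 to the negative googolplex are far too large and too small to write down, so the stand-ins below are scaled enormously down while preserving the shape of the calculation):

```python
from fractions import Fraction

# Toy stand-ins for the unwriteable real quantities.
p_mugger = Fraction(1, 10**100)   # "exponentially tiny" probability of the wizard hypothesis
u_mugger = 3**500                 # "vast" utility at stake in that hypothesis
u_mainline = 10**9                # everything at stake in the no-wizards mainline

# Expected-utility contribution of the mugger's hypothesis:
eu_mugger = p_mugger * u_mugger

# Even after the tiny probability, the vast-utility branch dwarfs the mainline:
assert eu_mugger > u_mainline
```

The point being illustrated: as long as the utility term grows faster than the probability term shrinks, the product still swamps every mainline consideration.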

if you have the imag­i­na­tion to imag­ine X to be su­per-huge, you should be able to have the imag­i­na­tion to imag­ine p to be su­per-small

But we can’t just set the prob­a­bil­ity to any­thing we like. We have to calcu­late it, and Kol­mogorov com­plex­ity, the stan­dard ac­cepted method, will not be any­where near that su­per-small.

Tom and An­drew, it seems very im­plau­si­ble that some­one say­ing “I will kill 3^^^^3 peo­ple un­less X” is liter­ally zero Bayesian ev­i­dence that they will kill 3^^^^3 peo­ple un­less X. Though I guess it could plau­si­bly be weak enough to take much of the force out of the prob­lem.

Noth­ing could pos­si­bly be that weak.

Tom is right that the pos­si­bil­ity that typ­ing QWERTYUIOP will de­stroy the uni­verse can be safely ig­nored; there is no ev­i­dence ei­ther way, so the prob­a­bil­ity equals the prior, and the Solomonoff prior that typ­ing QWERTYUIOP will save the uni­verse is, as far as we know, ex­actly the same.

Ex­actly the same? Th­ese are differ­ent sce­nar­ios. What hap­pens if an AI ac­tu­ally calcu­lates the prior prob­a­bil­ities, us­ing a Solomonoff tech­nique, with­out any a pri­ori de­sire that things should ex­actly can­cel out?

In other articles, you have discussed the notion that, in an infinite universe, there exist with probability 1 identical copies of me some 10^(10^29) meters away. You then (correctly, I think) demonstrate the absurdity of declaring that one of them in particular is ‘really you’ and another is a ‘mere copy’.

When you say “3^^^^3 peo­ple”, you are pre­sent­ing me two sep­a­rate con­cepts:

In­di­vi­d­ual en­tities which are each “peo­ple”.

A set {S} of these en­tities, of which there are 3^^^^3 mem­bers.

Now, at this point, I have to ask my­self: “what is the prob­a­bil­ity that {S} ex­ists?”

By which I mean, what is the prob­a­bil­ity that there are 3^^^^3 unique con­figu­ra­tions, each of which qual­ifies as a self-aware, ex­pe­rienc­ing en­tity with moral weight, with­out re­duc­ing to an “effec­tive simu­la­tion” of an­other en­tity already counted in {S}?

Vs. what is the prob­a­bil­ity that the to­tal car­di­nal­ity of unique con­figu­ra­tions that each qual­ify as self-aware, ex­pe­rienc­ing en­tities with moral weight, is < 3^^^^3?

Be­cause if we’re go­ing to jug­gle Bayesian prob­a­bil­ities here, at some point that has to get stuck in the pipe and smoked, too.

Why would an AI con­sider those two sce­nar­ios and no oth­ers? Seems more likely it would have to chew over ev­ery equiv­a­lently-com­plex hy­poth­e­sis be­fore com­ing to any ac­tion­able con­clu­sion… at which point it stops be­ing a wor­ri­some, po­ten­tially world-de­stroy­ing AI and be­comes a brick, with a progress bar that won’t visi­bly ad­vance un­til af­ter the last pro­ton has de­cayed.

Clough: On the contrary, I think it is not only that weak but actually far weaker. If you are willing to consider the existence of things like 3^^^3 units of disutility without considering the existence of chances like 1/4^^^4, then I believe that is the problem that is causing you so much trouble.

I’m cer­tainly will­ing to con­sider the ex­is­tence of chances like that, but to ar­rive at such a calcu­la­tion, I can’t be us­ing Solomonoff in­duc­tion.

Con­sider the plight of the first nu­clear physi­cists, try­ing to calcu­late whether an atomic bomb could ig­nite the at­mo­sphere. Yes, they had to do this calcu­la­tion! Should they have not even both­ered, be­cause it would have kil­led so many peo­ple that the prior prob­a­bil­ity must be very low? The es­sen­tial prob­lem is that the uni­verse doesn’t care one way or the other and there­fore events do not in fact have prob­a­bil­ities that diminish with in­creas­ing di­su­til­ity.

Like­wise, physics does not con­tain a clause pro­hibit­ing com­par­a­tively small events from hav­ing large effects. Con­sider the first repli­ca­tor in the seas of an­cient Earth.

Tiiba: You don’t want an AI to think like this be­cause you don’t want it to kill you. Mean­while, to a true al­tru­ist, it would make perfect sense.

So you’re bit­ing the bul­let and say­ing that, faced with a Pas­cal’s Mug­ger, you should give him the five dol­lars?

Would any com­menters care to mug Tiiba? I can’t quite bring my­self to do it, but it needs do­ing.

Kr­ish­naswami: Utility func­tions have to be bounded ba­si­cally be­cause gen­uine mar­t­in­gales screw up de­ci­sion the­ory—see the St. Peters­burg Para­dox for an ex­am­ple.

One deals with the St. Peters­burg Para­dox by ob­serv­ing that the re­sources of the cas­ino are finite; it is not nec­es­sary to bound the util­ity func­tion it­self when you can bound the game within your world-model.
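That observation can be made concrete with a small sketch (hypothetical bankroll figures): once the casino’s resources are finite, the game’s famously divergent expected value becomes an ordinary finite sum.

```python
def st_petersburg_ev(bankroll):
    """Expected value of the St. Petersburg game when the casino can pay
    out at most `bankroll` dollars: the divergent series gets truncated."""
    ev, payout, p = 0.0, 1, 0.5
    while payout <= bankroll:
        ev += p * payout   # each doubling round contributes p * payout = 0.5
        payout *= 2
        p /= 2
    return ev

# A casino worth 2^30 dollars turns the "infinite" EV into a modest $15.50:
assert st_petersburg_ev(2**30) == 15.5
```

Each affordable round contributes exactly $0.50, so bounding the bankroll bounds the number of rounds and hence the price a rational agent should pay, with no bound on the utility function itself.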

It might be more promis­ing to as­sume that states with many peo­ple hurt have a low cor­re­la­tion with what any ran­dom per­son claims to be able to effect.

Robin: Great point about states with many peo­ple hav­ing low cor­re­la­tions with what one ran­dom per­son can effect. This is fairly triv­ially prov­able.

Aha!

For some rea­son, that didn’t click in my mind when Robin said it, but it clicked when Vas­sar said it. Maybe it was be­cause Robin speci­fied “many peo­ple hurt” rather than “many peo­ple”, or be­cause Vas­sar’s part about be­ing “prov­able” caused me to ac­tu­ally look for a rea­son. When I read Robin’s state­ment, it came through as just “Ar­bi­trar­ily pe­nal­ize prob­a­bil­ities for a lot of peo­ple get­ting hurt.”

But, yes, if you’ve got 3^^^^3 peo­ple run­ning around they can’t all have sole con­trol over each other’s ex­is­tence. So in a sce­nario where lots and lots of peo­ple ex­ist, one has to pe­nal­ize by a pro­por­tional fac­tor the prob­a­bil­ity that any one per­son’s bi­nary de­ci­sion can solely con­trol the whole bunch.

Even if the Ma­trix-claimant says that the 3^^^^3 minds cre­ated will be un­like you, with in­for­ma­tion that tells them they’re pow­er­less, if you’re in a gen­er­al­ized sce­nario where any­one has and uses that kind of power, the vast ma­jor­ity of mind-in­stan­ti­a­tions are in leaves rather than roots.

This seems to me to go right to the root of the prob­lem, not a full-fledged for­mal an­swer but it feels right as a start­ing point. Any ob­jec­tions?
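One way to sketch that starting point (with toy magnitudes standing in for the real ones): if at most 1/N of the people in a hypothesis can each hold sole control over N others, the hypothesis picks up an extra 1/N penalty that exactly cancels the N at stake.

```python
from fractions import Fraction

N = 10**50                               # stand-in for the 3^^^^3 lives at stake
complexity_prior = Fraction(1, 2**1000)  # Solomonoff-style penalty (toy figure)

# Anthropic/leverage penalty: at most 1/N of mind-instantiations can be
# the one "root" with sole control over N "leaves".
leverage_penalty = Fraction(1, N)

expected_utility = complexity_prior * leverage_penalty * N

# The N at stake cancels against the 1/N penalty, leaving only the
# ordinary complexity prior -- no runaway dominance:
assert expected_utility == complexity_prior
```

The cancellation is exact by construction here; the thread’s later worry is whether the anthropic measure and the ethical measure really do use the same N.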

Is that a gen­eral solu­tion? What about this: “Give me five dol­lars or I will perform an ac­tion, the di­su­til­ity of which will be equal to twice that of you giv­ing me five dol­lars, mul­ti­plied by the re­cip­ro­cal of the prob­a­bil­ity of this state­ment be­ing true.”

Well, I’d rather lose twenty dol­lars than be kicked in the groin very hard, and the prob­a­bil­ity of you suc­ceed­ing in do­ing that given you be­ing close enough to me and try­ing to do so is greater than 1⁄2, so...

The point is that no more than 1/​3^^^3 peo­ple have sole con­trol over the life or death of 3^^3 peo­ple. This im­prob­a­bil­ity, that you would be one of those very spe­cial peo­ple, IS big enough.

(This an­swer fails un­less your ethics and an­throp­ics use the same mea­sure. That’s how the pig ex­am­ple works.)

So can we solve the prob­lem by putting some sort of up­per bound on the de­gree to which ethics and an­throp­ics can differ, along the lines of “cre­ation of 3^^^^3 peo­ple is at most N times less prob­a­ble than cre­ation of 3^^^^3 pigs, so across the en­sem­ble of pos­si­ble wor­lds the prior against your be­ing in a po­si­tion to in­fluence that many pigs still cuts down the ex­pected util­ity from some­thing vaguely like 3^^^^3 to some­thing vaguely like N”?

So you’re say­ing that the im­plau­si­bil­ity is that I’d run into a per­son that just hap­pened to have that level of “power” ?

Is that differ­ent in kind to what I was say­ing?

If I find it implausible that the person I’m speaking to can actually do what they’re claiming, is that not the same as it being implausible that I happen to have met a person who can do what this person is claiming? (Leaving aside the resource question, which is probably just my rationalisation as to why I think he couldn’t pull it off.)

Ba­si­cally I’m try­ing to taboo the ac­tual BigNum… and try­ing to fit the con­cepts around in my head.

It’s im­plau­si­ble that you’re the per­son with that power. We could eas­ily imag­ine a world in which ev­ery­one runs into a sin­gle ab­surdly pow­er­ful per­son. We could not imag­ine a world in which ev­ery­one was ab­surdly pow­er­ful (in their abil­ity to con­trol other peo­ple), be­cause then mul­ti­ple peo­ple would have con­trol over the same thing.

If you knew that he had the power, but that his ac­tion wasn’t go­ing to de­pend on yours, then you wouldn’t give him the money. So you’re only con­cerned with the situ­a­tion where you have the power.

Ok, sure thing. I get what you’re say­ing.
I managed to encompass that implausibility in the arguments I made in my restatement anyway, but yeah, I agree that these are different kinds of “unlikely thing”.

Even if the Ma­trix-claimant says that the 3^^^^3 minds cre­ated will be un­like you, with in­for­ma­tion that tells them they’re pow­er­less, if you’re in a gen­er­al­ized sce­nario where any­one has and uses that kind of power, the vast ma­jor­ity of mind-in­stan­ti­a­tions are in leaves rather than roots.

The point is that no more than 1/​3^^^3 peo­ple have sole con­trol

I was about to ex­press mild amuse­ment about how cav­a­lier we are with jump­ing to, from and be­tween num­bers like 3^^^^3 and 3^^^3. I had to squint to tell the differ­ence. Then it oc­curred to me that:

The point is that no more than 1/​3^^^3 peo­ple have sole con­trol over the life or death of 3^^3 peo­ple. This im­prob­a­bil­ity, that you would be one of those very spe­cial peo­ple, IS big enough.

3^^3 is big, but not unimaginably big, Knuth arrows or no. It’s about 7.6 trillion—a number you can still write out in full in thirteen digits.

Well, I didn’t want to de­clare a proofread­ing er­ror be­cause 3^^^3 does tech­ni­cally fit cor­rectly in the con­text, even if you may not have meant it. ;)

I was think­ing the fact that we are so cav­a­lier makes it eas­ier to slip be­tween them if not pay­ing close at­ten­tion. Espe­cially since 3^^^3 is more com­monly used than 3^^^^3. I don’t ac­tu­ally re­call Eliezer go­ing be­yond pen­ta­tion el­se­where.

I know if I go that high I tend to use 4^^^^4. It ap­peals more aes­thet­i­cally and is more clearly dis­tinct. Mind you it isn’t nearly as neat as 3^^^3 given that 3^^^3 can also be writ­ten and vi­su­al­ized con­cep­tu­ally as 3 → 3 → 3 while 4^^^^4 is just 4 → 4 → 4 not 4 → 4 → 4 → 4.
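For reference, the notation being juggled here can be sketched directly; Python’s arbitrary-precision integers handle 3^^3 comfortably, while even one more arrow is hopeless:

```python
def up(a, n, b):
    """Knuth's up-arrow a ^(n arrows) b: one arrow is ordinary
    exponentiation, and each extra arrow iterates the previous operation."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up(a, n - 1, up(a, n, b - 1))

assert up(3, 1, 3) == 27                 # 3^3
assert up(3, 2, 3) == 7625597484987      # 3^^3 = 3^(3^3) = 3^27

# up(3, 3, 3) -- i.e. 3^^^3 -- is an exponential tower of 3s
# 7,625,597,484,987 layers tall; no physically realizable computer can
# evaluate it, which is rather the point of the thread.
```

This also makes the one-extra-caret difference between 3^^3, 3^^^3, and 3^^^^3 easy to check mechanically rather than by squinting.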

The mug­ger is mak­ing an ex­traor­di­nary claim. One for which he has pro­vided no ev­i­dence.

The amount of evidence required to make me believe that his claim is possible grows in proportion to the size of his claim.

Think about it at the lower lev­els of po­ten­tial claims.

1)
If he claimed to be able to kill one per­son—I’d be­lieve that he was ca­pa­ble of kil­ling one per­son. I’d then weigh that against the like­li­hood that he’d pick me to black­mail, and the low black­mail amount that he’d picked… and con­sider it more likely that he’s ly­ing to make a fast buck, than that he ac­tu­ally has a hostage some­where ready to kill.

2)
If he claimed to be able to kill 3^3 people, I’d consider it plausible… with a greatly diminished likelihood. I’d have to weigh the evidence that he was a small-time terrorist, willing to take the strong risk of being caught while preparing to blow up a building’s worth of people… or to value his life so low as to actually do it and die in the process. It’s not very high, but we’ve all seen people like this in our lifetime both exist and carry out this threat. So it’s “plausible but extremely unlikely”.

The likelihood that I’ve: a) happened to run into one of these rare people and b) that he’d pick me (pretty much a nobody) to blackmail combine to be extremely unlikely… and I’d reckon that those two, balanced against the much higher prior likelihood that he’s just a con-artist, would fairly well cancel out against the actual value of a building’s worth of people.

Especially when you consider that the resources to do this would far outweigh the money he’s asked for. As far as I know about people willing to kill large numbers of people—most of them do it for a reason, and that reason is almost never a paltry amount of cash. It’s still possible… after all, the school killers have done crazy stunts to kill people for a tiny reason… but usually there’s fame or revenge involved… not blackmail of a nobody.

3)
So now we move to 3^^3 peo­ple.
Now, I per­son­ally have never seen that many die in one sit­ting (or even as the re­sult of a sin­gle per­son)… but my Grand­father did, and us­ing tech­nol­ogy from 65 years ago.

It is plausible, though even less likely than before, that the person I’ve just run into happens to be willing and able to use a nuke on a large city, or to have the leadership capabilities (and luck) required to take over a country and divert its resources to killing that number of people.

I would con­sider it ex­po­nen­tially less likely that he’d pick me to black­mail about this… and cer­tainly not for such a pitiful amount of cash. Peo­ple that threaten this kind of thing are ei­ther af­ter phe­nom­e­nal amounts of money, recog­ni­tion or some kind of poli­ti­cal or re­li­gious state­ment.… they are ex­tremely un­likely to find a ran­dom cit­i­zen to black­mail for a tiny amount of cash. The like­li­hood that this is a con seems about as high as the num­ber of peo­ple to po­ten­tially die.

4)
Now we hit the first real BigNum.
AFAIK, the world has never seen 3^^^3 sen­tient in­tel­li­gences ever die in one sit­ting. We don’t have that many peo­ple on the Earth right now. Maybe the uni­verse has seen it some­where… some plane­tary sys­tem wiped out in a su­per­nova. It’s plau­si­ble… but now think of the claims the guy is mak­ing:

a) that he can cre­ate (or knows of) a civil­i­sa­tion that con­tains that num­ber of sen­tient be­ings.

b) that he (and he alone) has the abil­ity to de­stroy that civil­i­sa­tion, and can do so at whim and that

c) it’s worth­while him do­ing so for the mere pit­tance he’s de­mand­ing from a com­plete, un­re­lated no­body… (or po­ten­tially the whim of watch­ing said no­body squirm).

I ac­tu­ally think that the re­quired (and miss­ing) ev­i­dence for his out­ra­geous claims stack fairly evenly against the po­ten­tial down­side of his claims ac­tu­ally be­ing true.

So, to get back to the origi­nal point:
In my mind, as each step grows ex­po­nen­tially more ex­treme, so does the ev­i­dence re­quired to sup­port such a lu­dicrous claim. Th­ese two can­cel out roughly evenly, leav­ing the lef­tovers of “is he likely to have picked me?” and other smaller prob­a­bil­ities to ac­tu­ally sway the bal­ance.

Those, added with the large di­su­til­ity of “en­courag­ing the guy to do it again” would sway me to choose not to give him £5, but to walk away, then im­me­di­ately find the near­est po­lice officer...

3) So now we move to 3^^3 peo­ple. Now, I per­son­ally have never seen that many die in one sit­ting (or even as the re­sult of a sin­gle per­son)… but my Grand­father did, and us­ing tech­nol­ogy from 65 years ago.

3^^3 is a thou­sand times larger than the num­ber of peo­ple cur­rently al­ive.

Yes, I should not have used the word ex­po­nen­tial… but I don’t know the word for “grows at a rate that is a tower of ex­po­nen­tials”… “hy­per­ex­po­nen­tial” per­haps?

How­ever—I con­sider that my ar­gu­ment still holds. That the ev­i­dence re­quired grows at the same rate as the size of the claim.

The ev­i­dence must be of equal value to the claim.

(from “ex­traor­di­nary claims re­quire ex­traor­di­nary ev­i­dence”)

My point in explaining the lower levels is that we don’t demand evidence from most claimants of small amounts of damage because we’ve already seen evidence that these threats are plausible. But if we start getting to the “hyperexponential” threats, we hit a point where we suddenly realise that there is no evidence supporting the plausibility of the claim… so we automatically assume that the person is a crank.

Peo­ple have been talk­ing about as­sum­ing that states with many peo­ple hurt have a low (prior) prob­a­bil­ity. It might be more promis­ing to as­sume that states with many peo­ple hurt have a low cor­re­la­tion with what any ran­dom per­son claims to be able to effect.

Kon­rad: In com­pu­ta­tional terms, you can’t avoid us­ing a ‘hack’. Maybe not the hack you de­scribed, but some­thing, some­where has to be hard-coded.

Well, yes. The al­ter­na­tive to code is not solip­sism, but a rock, and even a rock can be viewed as be­ing hard-coded as a rock. But we would pre­fer that the code be el­e­gant and make sense, rather than us­ing a lo­cal patch to fix spe­cific prob­lems as they come to mind, be­cause the lat­ter ap­proach is guaran­teed to fail if the AI be­comes more pow­er­ful than you and re­fuses to be patched.

An­drew: You’re say­ing that your pri­ors have to come from some rigor­ous procedure

The pri­ors have to come from some com­putable pro­ce­dure. We would pre­fer it to be a good one, as agents with non­sense pri­ors will not at­tain sen­si­ble pos­te­ri­ors.

but your util­ity comes from sim­ply tran­scribing what some dude says to you.

No. Cer­tain hy­po­thet­i­cal sce­nar­ios, which we de­scribe us­ing the for­mal­ism of Tur­ing ma­chines, have fixed util­ities—that is, if some de­scrip­tion of the uni­verse is true, it has a cer­tain util­ity.

The problem with this scenario is not that we believe everything the dude tells us. The problem is that the description of a certain very large universe with a very large utility does not have a correspondingly tiny prior probability if we use Solomonoff’s prior. And then as soon as we see any evidence, no matter how tiny—anything whose entanglement is not as tiny as the very large universe is large—that expected utility differential instantly wipes out all other factors in our decision process.

Se­cond, even if for some rea­son you re­ally want to work with the util­ity of 3^^^^3, there’s no good rea­son for you not to con­sider the pos­si­bil­ity that it’s re­ally −3^^^^3, and so you should be do­ing the op­po­site.

A Solomonoff in­duc­tor might in­deed con­sider it, though there’s the prob­lem of any bounded ra­tio­nal­ist not be­ing able to con­sider all com­pu­ta­tions. It seems “rea­son­able” for a bounded mind to con­sider it here; you did, af­ter all.

The is­sue is not that two huge num­bers will ex­actly can­cel out; the point is that you’re mak­ing up all the num­bers here but are ar­tifi­cially con­strain­ing the ex­pected util­ity differ­en­tial to be pos­i­tive.

Let the differ­en­tial be nega­tive. Same prob­lem. If the differ­en­tial is not zero, the AI will ex­hibit un­rea­son­able be­hav­ior. If the AI liter­ally thinks in Solomonoff in­duc­tion (as I have de­scribed), it won’t want the differ­en­tial to be zero, it will just com­pute it.

[Late edit: I have since re­tracted this solu­tion as wrong, see com­ments be­low; left here for com­plete­ness. The ACTUAL solu­tion that re­ally works I’ve writ­ten in a differ­ent com­ment :) ]

I do be­lieve I’ve solved this. Don’t know if any­one is still read­ing or not af­ter all this time, but here goes.

Eliezer speaks of the sym­me­try of Pas­cal’s wa­ger; I’m go­ing to use some­thing very similar here to solve the is­sue.
The num­ber of things that could hap­pen next—say, in the next nanosec­ond—is in­finite, or at the very least in­calcu­la­ble. A lot of mun­dane things could hap­pen, or a lot of un­fore­seen things could hap­pen. It could hap­pen that a car would go through my liv­ing room and kill me. Or it could hap­pen that the laws of en­ergy con­ser­va­tion were vi­o­lated and the whole world would turn into bleu cheese. Each of these pos­si­bil­ities could, in the­ory, have a prob­a­bil­ity as­signed to it, given our pri­ors.

But! We only have enough com­put­ing power to calcu­late a finite num­ber of out­comes at any given mo­ment. That means that we CANNOT go around as­sign­ing prob­a­bil­ities by calcu­la­tion. Rather, we’re go­ing to need some heuris­tic to deal with all the prob­a­bil­ities we do NOT calcu­late.

Sup­pose our AI is very good at pre­dict­ing things. It man­ages to as­sign SOME prob­a­bil­ity to what will hap­pen next about 99% of the time (Note: My solu­tion works equally well for any­thing from 0% to 100% minus ep­silon—and I shouldn’t have to ex­plain why a Bayesian AI should never be 100% cer­tain that it got an an­swer right). That means that 1% of the time, some­thing REALLY sur­prises it; it just did not as­sign any prob­a­bil­ity at all.
Now, be­cause the num­ber of things that could be in that cat­e­gory is in­finite, they can­cel out. Sure, we could all turn to cheese if it says “abra­cadabra”. Or we could turn to cheese UNLESS it says so. The util­ity func­tions will always end in 0 for the un­calcu­lated mass of prob­a­bil­ities.

That means that the AI always works un­der the as­sump­tions that “or some­thing I didn’t see com­ing will hap­pen; but I must be neu­tral re­gard­ing such an out­come un­til I know more about it”.

Now. Say the AI manages to consider 1 million possibilities per prediction it makes (how it still gets 1% of them wrong is beyond me, but again, the exact number doesn’t matter for my solution). So any outcome that has NOT been calculated could, in fact, be considered to have a probability of 1% / 1 million—not because there are only a million possibilities the AI hasn’t considered, but because that is how many it could TRY to consider.

This num­ber is your cut­off. Be­fore you mul­ti­ply a prob­a­bil­ity with a util­ity func­tion, you sub­tract this num­ber from the prob­a­bil­ity, first. So now if some­one comes up to you and says it’ll kill 3^^^^3 peo­ple and you de­cide to ac­tu­ally spend the cy­cles to con­sider how likely that is, and you get 1/​googol, that num­ber is LESS than the back­ground noise of ev­ery­thing you don’t have time to calcu­late. You round it down to zero, not be­cause it is ar­bi­trar­ily small enough, but be­cause any­thing you have not con­sid­ered for calcu­la­tion must be con­sid­ered to have higher prob­a­bil­ity—and like in Pas­cal’s wa­ger, those op­tions’ util­ity is in­finite and can counter any num­ber that Pas­cal’s Mug­ger can throw at me. You sub­tract, not an ar­bi­trary num­ber, but rather a num­ber de­pend­ing on how long the AI is think­ing about the prob­lem; how many pos­si­bil­ities it takes into ac­count.
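As a sketch, using the commenter’s own illustrative figures (a 1% surprise rate and a million considered hypotheses—both assumptions of the comment, not established values):

```python
def adjusted_expected_utility(p, utility, surprise_rate=0.01, considered=10**6):
    """The commenter's proposed heuristic: subtract the 'background noise'
    of uncomputed hypotheses from every probability before multiplying."""
    noise_floor = surprise_rate / considered   # 0.01 / 1e6 = 1e-8
    return max(p - noise_floor, 0.0) * utility

# The mugger's 1/googol probability falls below the noise floor and
# contributes nothing, however vast the utility attached to it:
assert adjusted_expected_utility(1e-100, 3.0**500) == 0.0

# Ordinary probabilities are barely affected by the subtraction:
assert abs(adjusted_expected_utility(0.5, 10) - 5.0) < 1e-6
```

Note the cutoff is not arbitrary in this scheme: it is pinned to how many hypotheses the bounded reasoner could actually have evaluated.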

Does this solve the prob­lem? I think it does.

(By the way: ChrisA’s way also works against this prob­lem, ex­cept that cod­ing your AI so that it may dis­re­gard value and moral­ity if cer­tain con­di­tions are met seems like a pretty risky propo­si­tion).

Don’t get me wrong, in my suggestion the AI is NOT going against its values nor being irrational, and this was not meant as a hack. Rather I’m claiming that the basic method of doing rationality as described needs revision that accounts for practicality, and if you disagree with that then your next rational move should DEFINITELY be to send me $50 RIGHT NOW because I TOTALLY have a button that kicks 4^^^^4 puppies if I press it RIGHT HERE.

Hav­ing said that, I do think I might have made an er­ror of in­tu­ition in there, so let’s re­think it. Just be­cause we should re­think what con­sti­tutes ra­tio­nal be­hav­ior does not mean I got it right.

Sup­pose I am an om­nipo­tent be­ing and have cre­ated a but­ton that does some­thing, once, if pressed. I truth­fully tell you that there are sev­eral pos­si­ble out­comes:

You receive $10. This has a 45% chance of happening.

You lose $5. This, too, has a 45% chance of happening.

Some­thing else hap­pens.

You should be pretty interested in what this “something else” might be before you press the button, since I’ve put absolutely no bounds on it. You could win $1000. Or you could die. The whole world could die. You would wake up in a protein bath outside the Matrix. etc. etc.
Some of these things you might be able to pre­pare for, if you know about them in ad­vance.

If you’re rational and you get no further information, you should probably press the button. The expected gain is $2.25; as in Pascal’s Wager, the infinity of possibilities that stem from the third option cancel each other out.
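Spelled out, treating the unbounded third option as canceling to zero, as the comment argues:

```python
# Expected value of pressing the button: 45% chance of winning $10,
# 45% chance of losing $5, and the remaining 10% of unknown outcomes
# taken to cancel to zero net utility.
ev = 0.45 * 10 + 0.45 * (-5) + 0.10 * 0

assert abs(ev - 2.25) < 1e-12   # a positive expectation, so press
```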

Now suppose that, before you press, I tell you that you get 10 guesses as to what the third thing is. Every time you guess, I tell you the precise probability that this thing is possible. Furthermore, the third option could do at least 12 different things, so no matter what you guessed, you would not be able to tell exactly what the button might do.

So you start guess­ing. One of your guesses is “3^^^^3 peo­ple will die hor­ribly”. I rate that one as a 10^-100 chance.

You’ve reached the end of the guesses and still a full 5% of probability remains—half of the third option’s share.

So. Now do we press the but­ton?

My claim was that you should ignore every outcome with less than a 1% chance in this case, regardless of its utility. This now seems to me like a mistake.
In theory, when we add the utility of all known options, it comes out extremely negative, because the remaining 5% of unknowns still each have effectively zero chance of happening, and they STILL cancel each other out.

I think I even know where my math­e­mat­i­cal er­ror was: I was as­sum­ing that any­thing less than 1% is a waste of a guess and there­fore we should have guessed some­thing else, which quite pos­si­bly has a higher chance—this es­tab­lishes a cut­off for “a calcu­la­tion that was not worth do­ing”. How­ever in this new ex­am­ple there are at least 12 things the but­ton can do; es­sen­tially the num­ber is in­finite as far as I know. I should count my­self VERY lucky to get 1% or more for any­thing I guess. In fact I should ex­pect to get an an­swer of zero or ep­silon for pretty much ev­ery­thing. That means that no guess is truly wasted or triv­ial.

Of course, if we don’t press the but­ton the Pas­cal Mug­gers will have won...

If the in­jured par­ties are hu­mans, I should be very skep­ti­cal of the as­ser­tion be­cause a very small frac­tion, (1/​3^^3)*1/​10^(some­thing), of peo­ple have the power of life and death over 3^^^3 other peo­ple, whereas 1/​10^(some­thing smaller) hear the cor­re­spond­ing hoax.

That’s the only an­swer that makes sense be­cause it’s the only an­swer that works on a scale of 3^^^3.

“If the in­jured par­ties are hu­mans, I should be very skep­ti­cal of the as­ser­tion be­cause a very small frac­tion, (1/​3^^3)*1/​10^(some­thing)”

You don’t know that. In fact, you don’t know that with a degree of uncertainty that, if I thought I had a lot on the line, I might not take lightly.

I’m try­ing to think up sev­eral av­enues. One is that the higher the claimed util­ity, the lower the prob­a­bil­ity (some­how); an­other tries to use the im­pli­ca­tions that ac­cept­ing the claim would have on other prob­a­bil­ities in or­der to can­cel it out.

I know be­cause of an­throp­ics. It is a log­i­cal im­pos­si­bil­ity for more than 1/​3^^^3 in­di­vi­d­u­als to have that power. You and I can­not both have power over the same thing, so the to­tal amount of power is bounded, hope­fully by the same pop­u­la­tion count we use to calcu­late an­throp­ics.

Not in the least con­ve­nient pos­si­ble world. What if some­one told you that 3^^^3 copies of you were made be­fore you must make your de­ci­sion and that their be­havi­our was highly cor­re­lated as ap­plies to UDT? What if the be­ings who would suffer had no con­scious­ness, but would have moral worth as judged by you(r ex­trap­o­lated self)? What if there was one be­ing who was able to ex­pe­rience 3^^^3 times as much eu­daimo­nia as ev­ery­one else? What if the self-in­di­ca­tion as­sump­tion is right?

If you’re go­ing to en­gage in mo­ti­vated cog­ni­tion at least con­sider the least con­ve­nient pos­si­ble world.

1) Sorry, I con­fused this with an­other prob­lem; I meant some ran­dom guy.

2⁄3) Isn’t how your decision process handles infinities rather important? Is there a theorem corresponding to the von Neumann–Morgenstern utility theorem but without using either version of axiom 3? I have been meaning to look into this, and depending on what I find I may do a top-level post about it. Have you heard of one?

edit: I found Fishburn, 1971, “A Study of Lexicographic Expected Utility”, Management Science. It’s behind a paywall at http://www.jstor.org/pss/2629309. Can anyone find a non-paywall version or email it to me?

4) Yeah, my fourth one doesn’t work. I re­ally should have known bet­ter.

Sometimes, infinities must be made rigorous rather than eliminated. I feel that, in this case, it’s worth a shot.

What wor­ries me about in­fini­ties is, I sup­pose, the in­finite Pas­cal’s mug­ging—when­ever there’s a sin­gle in­finite bro­ken sym­me­try, noth­ing that hap­pens in any finite world mat­ters to de­ter­mine the out­come.

This implies that all our thought should be devoted to infinite rather than finite worlds. And if all worlds are infinite, it looks like we need to do some form of SSA dealing with utility again.

This is all very con­ve­nient and not very rigor­ous, I agree. I can­not see a bet­ter way, but I agree that we should look. I will use uni­ver­sity library pow­ers to read that ar­ti­cle and send it to you, but not right now.

I don’t see any way to avoid the in­finite Pas­cal’s mug­ging con­clu­sion. I think that it is prob­a­bly dis­cour­aged due to a his­tory of as­so­ci­a­tion with bad ar­gu­ments and the ac­tual way to max­i­mize the chance of in­finite benefit will seem more ac­cept­able.

I will use uni­ver­sity library pow­ers to read that ar­ti­cle and send it to you, but not right now.

Consider an infinite universe consisting of infinitely many copies of Smallworld, and another consisting of infinitely many copies of Bigworld.

It seems like the only rea­son­able way to com­pute ex­pected util­ity is to com­pute SSA or pseudo-SSA in Big­world and Smal­l­world, thus com­put­ing the av­er­age util­ity in each in­finite world, with an im­plied fac­tor of omega.

Rea­son­ing about in­finite wor­lds that are made of sev­eral differ­ent, causally in­de­pen­dent, finite com­po­nents may pro­duce an in­tu­itively rea­son­able mea­sure on finite wor­lds. But what about in­finite wor­lds that are not com­posed in this man­ner? An in­finite, causally con­nected chain? A se­ries of larger and larger wor­lds, with no sin­gle av­er­age util­ity?

It seems like the only rea­son­able way to com­pute ex­pected util­ity is to com­pute SSA or pseudo-SSA in Big­world and Smal­l­world, thus com­put­ing the av­er­age util­ity in each in­finite world, with an im­plied fac­tor of omega.

Be care­ful about us­ing an in­finity that is not the limit of an in­finite se­quence; it might not be well defined.

An in­finite, causally con­nected chain?

It depends on the specifics. This is a very underdefined structure.

A se­ries of larger and larger wor­lds, with no sin­gle av­er­age util­ity?

A di­ver­gent ex­pected util­ity would always be prefer­able to a con­ver­gent one. How to com­pare two di­ver­gent pos­si­ble uni­verses de­pends on the speci­fics of the di­ver­gence.

I will for­mal­ize my in­tu­itions, in ac­cor­dance with your first point, and thereby clar­ify what I’m talk­ing about in the third point.

Sup­pose agents ex­ist on the real line, and their util­ities are real num­bers. In­tu­itively, go­ing from u(x)=1 to u(x)=2 is good, and go­ing from u(x)=1 to u(x)=1+sin(x) is neu­tral.

The ob­vi­ous way to for­mal­ize this is with the limit­ing pro­cess:

limit as M goes to in­finity of ( the in­te­gral from -M to M of u(x)dx, di­vided by 2M )

This gives well-defined and nice an­swers to some situ­a­tions but not oth­ers.

However, you can construct functions u(x) where (the integral from -M to M of u(x)dx, divided by 2M) is an arbitrary differentiable function of M; in particular, one that has no limit as M goes to infinity. Note that it is not necessarily divergent: it may oscillate between 0 and 1, for instance.
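As a concrete check on the averaging intuition, here is a minimal numerical sketch (my own illustration; `window_average` and the midpoint rule are my assumptions, not anything from the thread):

```python
import math

def window_average(u, M, steps=100000):
    """Approximate (1/2M) * integral of u(x) from -M to M by the midpoint rule."""
    dx = 2 * M / steps
    total = sum(u(-M + (i + 0.5) * dx) for i in range(steps)) * dx
    return total / (2 * M)

# u(x) = 2 averages to 2 at every window size, so going from u = 1 to u = 2 is good.
print(window_average(lambda x: 2.0, 1000.0))
# u(x) = 1 + sin(x): the sin term cancels over a symmetric window, so the
# average tends to 1 -- the change from u = 1 is neutral, matching intuition.
print(window_average(lambda x: 1.0 + math.sin(x), 1000.0))
```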

I’m fairly cer­tain that if I have a de­scrip­tion of a sin­gle uni­verse, and a de­scrip­tion of an­other uni­verse, I can pro­duce a de­scrip­tion in the same lan­guage of a uni­verse con­sist­ing of the two, next to each other, with no causal con­nec­tion. Depend­ing on the de­scrip­tion lan­guage, for some uni­verses, I may or may not be able to tell that they can­not be writ­ten as the limit of a sum of finite uni­verses.

For any de­ci­sion-mak­ing pro­cess you’re us­ing, I can prob­a­bly tell you what an in­finite causal chain looks like in it.

Sup­pose agents ex­ist on the real line, and their util­ities are real num­bers. In­tu­itively, go­ing from u(x)=1 to u(x)=2 is good, and go­ing from u(x)=1 to u(x)=1+sin(x) is neu­tral.

Why must there be a uni­verse that cor­re­sponds to this situ­a­tion? The num­ber of agents has car­di­nal­ity beth-1. A suit­able gen­er­al­iza­tion of Pas­cal’s wa­ger would re­quire that we bet on the amount of util­ity hav­ing a larger car­di­nal­ity, if that even makes sense. Of course, there is no max­i­mum car­di­nal­ity, but there is a max­i­mum car­di­nal­ity ex­press­ible by hu­mans with a finite lifes­pan.

The ob­vi­ous way to for­mal­ize this is with the limit­ing pro­cess:

limit as M goes to in­finity of ( the in­te­gral from -M to M of u(x)dx, di­vided by 2M )

That is intuitively appealing, but it is arbitrary. Consider the step function that is 1 for positive agents and −1 for negative agents. Agent 0 can have a utility of 0 for symmetry, but we should not care about the utility of one agent out of infinity unless that agent is able to experience an infinity of utility. The limit of the integral from -M to M of u(x)dx/2M is 0, but the limit of the integral from 1-M to 1+M of u(x)dx/2M is 2, and the limit of the integral from -M to 2M of u(x)dx/3M is +infinity. While your case has some appealing symmetry, it is arbitrary to privilege it over these other integrals. This can also work with a sigmoid function, if you like continuity and differentiability.

I’m fairly cer­tain that if I have a de­scrip­tion of a sin­gle uni­verse, and a de­scrip­tion of an­other uni­verse, I can pro­duce a de­scrip­tion in the same lan­guage of a uni­verse con­sist­ing of the two, next to each other, with no causal con­nec­tion.

Wouldn’t you just add the two func­tions, if you are talk­ing about just the util­ities, or run the (pos­si­bly hy­per)com­pu­ta­tions in par­allel, if you are talk­ing about the whole uni­verses?

Depend­ing on the de­scrip­tion lan­guage, for some uni­verses, I may or may not be able to tell that they can­not be writ­ten as the limit of a sum of finite uni­verses.

So that the math can be as sim­ple as pos­si­ble. Solv­ing sim­ple cases is ad­vis­able. beth-1 is eas­ier to deal with in math­e­mat­i­cal no­ta­tion than beth-0, and any­thing big­ger is so com­pli­cated that I have no idea.

The limit of the integral from -M to M of u(x)dx/2M is 0, but the limit of the integral from 1-M to 1+M of u(x)dx/2M is 2, and the limit of the integral from -M to 2M of u(x)dx/3M is +infinity. While your case has some appealing symmetry, it is arbitrary to privilege it over these other integrals. This can also work with a sigmoid function, if you like continuity and differentiability.

Actually, those mostly go to 0.

1-M to 1+M gets you 2/2M = 1/M, which goes to 0. -M to 2M gets you M/3M = 1/3.
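These corrected limits are easy to check numerically for the step function. A rough sketch (my own illustration, with a made-up helper name):

```python
def window_mean(u, a, b, steps=150000):
    """Numerically integrate u over [a, b] (midpoint rule), divide by window length."""
    dx = (b - a) / steps
    return sum(u(a + (i + 0.5) * dx) for i in range(steps)) * dx / (b - a)

def sign(x):
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

for M in (100.0, 1000.0):
    # Window [1-M, 1+M]: the integral is 2, so the mean is 2/2M = 1/M -> 0.
    # Window [-M, 2M]:   the integral is M, so the mean is M/3M = 1/3.
    print(window_mean(sign, 1 - M, 1 + M), window_mean(sign, -M, 2 * M))
```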

This doesn’t mat­ter, as even this method, the most ap­peal­ing and sim­ple, fails in some cases, and there do not ap­pear to be other, bet­ter ones.

Wouldn’t you just add the two func­tions, if you are talk­ing about just the util­ities, or run the (pos­si­bly hy­per)com­pu­ta­tions in par­allel, if you are talk­ing about the whole uni­verses?

Yes, indeed. I would run the computations in parallel, stick the Bayes nets next to each other, add the functions from policies to utilities, etc. In the first two cases, I would be able to tell how many separate universes seem to exist; in the third, I would not.

Yes, how to han­dle cer­tain cases of in­finite util­ity looks ex­tremely non-ob­vi­ous. It is also nec­es­sary.

I agree. I have no idea how to do it. We have two options:

1. Find some valid argument why infinities are logically impossible, and worry only about the finite case.

2. Find some method for dealing with infinities.

Most people seem to assume 1, but I’m not sure why.

Oh, and I think I for­got to say ear­lier that I have the pdf but not your email ad­dress.

1-M to 1+M gets you 2/​2M=1/​M, which goes to 0. -M to 2M gets you M/​3M=1/​3.

I seem to have for­got­ten to di­vide by M.

Why must there be a uni­verse that cor­re­sponds to this situ­a­tion?

So that the math can be as sim­ple as pos­si­ble. Solv­ing sim­ple cases is ad­vis­able.

I didn’t mean to ask why you chose this case; I was ask­ing why you thought it cor­re­sponded to any pos­si­ble world. I doubt any uni­verse could be de­scribed by this model, be­cause it is im­pos­si­ble to make pre­dic­tions about. If you are an agent in this uni­verse, what is the prob­a­bil­ity that you are found to the right of the y-axis? Un­less the agents do not have equal mea­sure, such as if agents have mea­sure pro­por­tional to the com­plex­ity of lo­cat­ing them in the uni­verse, as Wei Dai pro­posed, this prob­a­bil­ity is un­defined, due to the same ar­gu­ment that shows the util­ity is un­defined.

This could be the first step in prov­ing that in­fini­ties are log­i­cally im­pos­si­ble, or it could be the first step in rul­ing out im­pos­si­ble in­fini­ties, un­til we are only left with ones that are easy to calcu­late util­ities for. There are some in­fini­ties that seem pos­si­ble: con­sider an in­finite num­ber of iden­ti­cal agents. This situ­a­tion is in­dis­t­in­guish­able from a sin­gle agent, yet has in­finitely more moral value. This could be im­pos­si­ble how­ever, if iden­ti­cal agents have no more re­al­ity-fluid than sin­gle agents or, more gen­er­ally, if a the­ory of mind or of physics, or, more likely, one of each, is de­vel­oped that al­lows you to calcu­late the amount of re­al­ity-fluid from first prin­ci­ples.

In gen­eral, an in­finity only seems to make sense for de­scribing con­scious ob­servers if it can be given a prob­a­bil­ity mea­sure. I know of two pos­si­ble sets of ax­ioms for a prob­a­bil­ity space. Cox’s the­o­rem looks good, but it is un­able to han­dle any in­finite sums, even rea­son­able ones like those used in the Solomonoff prior or finite, well defined in­te­grals. There’s also Kol­mogorov’s ax­ioms, but they are not self ev­i­dent, so it is not cer­tain that they can han­dle any pos­si­ble situ­a­tion.

Once you as­sign a prob­a­bil­ity mea­sure to each ob­server-mo­ment, it seems likely that the right way to calcu­late util­ity is to in­te­grate the util­ity func­tion over the prob­a­bil­ity space, times some over­all pos­si­bly in­finite con­stant rep­re­sent­ing the amount of re­al­ity fluid. Of course this can’t be a nor­mal in­te­gral, since util­ities, prob­a­bil­ities, and the re­al­ity-fluid co­effi­cient could all take in­finite/​in­finites­i­mal val­ues. That pdf might be a start on the util­ity side; the prob­a­bil­ity side seems harder, but that may just be be­cause I haven’t read the pa­per on Cox’s the­o­rem; and the re­al­ity-fluid prob­lem is pretty close to the hard prob­lem of con­scious­ness, so that could take a while. This seems like it will take a lot of ax­iom­a­ti­za­tion, but I feel closer to solv­ing this than when I started writ­ing/​re­search­ing this com­ment. Of course, if there is no need for a prob­a­bil­ity mea­sure, much of this is negated.

So, of course, the in­fini­ties for which prob­a­bil­ities are ill-defined are just those nasty in­fini­ties I was talk­ing about where the ex­pected util­ity is in­calcu­la­ble.

What we ac­tu­ally want to pro­duce is a prob­a­bil­ity mea­sure on the set of in­di­vi­d­ual ex­pe­riences that are copied, or what­ever thing has moral value, not on sin­gle in­stan­ti­a­tions of those ex­pe­riences. We can do so with a limit­ing se­quence of prob­a­bil­ity mea­sures of the whole thing, but prob­a­bly not a sin­gle mea­sure.

This will prob­a­bly lead to a situ­a­tion where SIA turns into SSA.

What both­ers me about this line of ar­gu­ment is that, ac­cord­ing to UDT, there’s noth­ing fun­da­men­tal about prob­a­bil­ities. So why should un­defined prob­a­bil­ities be more con­vinc­ing than un­defined ex­pected util­ities?

We still need some­thing very much like a prob­a­bil­ity mea­sure to com­pute our ex­pected util­ity func­tion.

Kolmogorov should be what you want. A Kolmogorov probability measure is just a measure where the measure of the whole space is 1. Is there something non-self-evident or non-robust about that? It’s just real analysis.

I think the whole integral can probably be contained within real-analytic conceptions. For example, you can use an alternate definition of measurable sets.

I dis­agree with your in­ter­pre­ta­tion of UDT. UDT says that, when mak­ing choices, you should eval­u­ate all con­se­quences of your choices, not just those that are causally con­nected to what­ever ob­ject is in­stan­ti­at­ing your al­gorithm. How­ever, while prob­a­bil­ities of differ­ent ex­pe­riences are part of our op­ti­miza­tion crite­ria, they do not need to play a role in the the­ory of op­ti­miza­tion in gen­eral. I think we should de­ter­mine more con­cretely whether these prob­a­bil­ities ex­ist, but their ab­sence from UDT is not very strong ev­i­dence against them.

The differ­ence be­tween SIA and SSA is es­sen­tially an over­all fac­tor for each uni­verse de­scribing its to­tal re­al­ity-fluid. Un­der cer­tain in­finite mod­els, there could be real-val­ued ra­tios.

The thing that worries me second-most about standard measure theory is infinitesimals. A Kolmogorov measure simply cannot handle a case with a finite measure of agents with finite utility and an infinitesimal measure of agents with an infinite utility.

The thing that wor­ries me most about stan­dard mea­sure the­ory is my own un­cer­tainty. Un­til I have time to read more deeply about it, I can­not be sure whether a sur­prise even big­ger than in­finites­i­mals is wait­ing for me.

I’ve been think­ing about Pas­cal’s Mug­ging with re­gard to de­ci­sion mak­ing and Friendly AI de­sign, and wanted to sum up my cur­rent thoughts be­low.

1a: If you are Pascal Mugged once, it greatly increases the chance of your being Pascal Mugged again.

1b: If the first mug­ger threat­ens 3^^^3 peo­ple, the next mug­ger can sim­ply threaten 3^^^^3 peo­ple. The mug­ger af­ter that can sim­ply threaten 3^^^^^3 peo­ple.

1c: It seems like you would have to take that into account as well. You could simply say to the mugger, “I’m sorry, but I must keep my money, because the chance of there being a second Mugger who threatens one Knuth up-arrow more people than you is high enough that I have to keep my money to protect those people against that threat, which is much more probable now that you have shown up.”

1d: Even if the Pascal Mugger threatens an infinite number of people with death, a second Pascal Mugger might threaten an infinite number of people with a slow, painful death. I still have what appears to be a plausible reason not to give the money.

1e: Assume the Pascal Mugger attempts to simply skip that and threatens me with infinite disutility. The second Pascalian Mugger could simply threaten me with an infinite disutility of a greater cardinality.

1f: Assume the Pascalian Mugger attempts to threaten me with an infinite disutility of the greatest possible infinite cardinality. A subsequent Pascalian Mugger could simply say, “You have made a mathematical error in processing the previous threats, and you are going to make a mathematical error in processing future threats. The amount of any other past or future Pascal’s Mugger threat is essentially 0 disutility compared to the amount of disutility I am threatening you with, which will be infinitely greater.”
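The up-arrow escalation in 1b can be made concrete with a minimal sketch of Knuth’s notation (my own illustration; only tiny inputs are feasible, since even 3^^^3 is far too large to compute):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow: a (n arrows) b. One arrow is ordinary
    exponentiation; each extra arrow iterates the previous operation."""
    if n == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up_arrow(a, n - 1, result)
    return result

print(up_arrow(3, 1, 3))  # 3^3 = 27
print(up_arrow(3, 2, 3))  # 3^^3 = 3^(3^3) = 7625597484987
# up_arrow(3, 3, 3) would be 3^^^3, a power tower 7625597484987 layers
# tall -- already unrepresentable, yet only one character longer to threaten.
```

Each added arrow feeds the entire previous result back in as the new tower height, which is why a threat one character longer (“3^^^^3” vs. “3^^^3”) outruns any probability penalty from its slightly longer description.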

I think this gets into the Berry paradox when considering threats. “A threat infinitely worse than the greatest possible threat statable in one minute” can be stated in less than one minute, so it seems as if it is possible for a Pascal’s Mugger to make a threat which is infinite and incalculable.

I am still work­ing through the im­pli­ca­tions of this but I wanted to put down what I had so far to make sure I could avoid er­rors.

That is a good point, but my read­ing of that topic is that it was the least con­ve­nient pos­si­ble world. I hon­estly do not see how it is pos­si­ble to word a great­est threat.

Once someone actually says out loud what any particular threat is, you always seem to be vulnerable to someone coming along and generating a threat which, when taken in the context of threats you have heard, seems greater than any previous threat.

I mean, I sup­pose to make it more in­con­ve­nient for me, The Pas­cal Mug­ger could add “Oh by the way. I’m go­ing to KILL you af­ter­ward, re­gard­less of your choice. You will find it im­pos­si­ble to con­sider an­other Pas­cal’s Mug­ger com­ing along and ask­ing you for your money.”

“But what if the second Pascal’s Mugger resurrects me? I mean, sure, it seems oddly improbable that he would do that just to demand 5 dollars, which I wouldn’t have if I’d given them to you, if I was already dead, and frankly it seems odd to even consider resurrection at all, but it could happen with a non-zero chance!”

I mean, yes, the idea of someone resurrecting you to mug you does seem completely, totally ridiculous. But the entire idea behind Pascal’s Mugging appears to be that we can’t throw out those tiny, tiny, out-of-the-way chances if there is a large enough threat backing them up.

So let’s think of an­other pos­si­ble least con­ve­nient world: The Mug­ger is Omega or Nomega. He knows ex­actly what to say to con­vince me that de­spite the fact that right now it seems log­i­cal that a greater threat could be made later, some­how this is the great­est threat I will ever face in my en­tire life, and the con­cept of a greater threat then this is liter­ally in­con­ceiv­able.

Except now the scenario requires me to believe that I can make a choice to give the Mugger $5, but NOT make a choice to retain my belief that a larger threat exists later.

That doesn’t quite sound like a good for­mu­la­tion of an in­con­ve­nient world ei­ther. (I can make choices ex­cept when I can’t?) I will keep try­ing to think of a more in­con­ve­nient world once I get home and will post it here if I think of one.

You may be wrong about such threats. In think­ing about this ques­tion, you re­duce your chance of be­ing wrong. This has a mas­sive ex­pected util­ity gain.

Con­clu­sion: You should spend all your time think­ing about this ques­tion.

Another ver­sion:

There’s a tiny probability of 3^^3 deaths. A tinier one of 3^^^3. A tinier one of 3^^^^3… Oops, looks like my expected utility is a divergent sum! I can’t use expected utility theory to figure out what to do any more!
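The divergence shows up even in a toy stand-in where each further hypothesis is half as probable but, say, four times as bad (my own illustration; genuine up-arrow utilities grow incomparably faster):

```python
# p_k = 2**-k, U_k = 4**k: each term contributes 2**k disutility,
# so the partial sums of expected disutility grow without bound.
def partial_expected_disutility(n):
    return sum((2.0 ** -k) * (4.0 ** k) for k in range(1, n + 1))

for n in (5, 10, 20):
    print(n, partial_expected_disutility(n))  # 62.0, 2046.0, 2097150.0 -- no limit
```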

Num­ber one is a very good point, but I don’t think the con­clu­sion would nec­es­sar­ily fol­low:

1: You may always need outside information to solve the problem. For instance, if I am looking for a key to Room 3, under the assumption that it is in Room 1 because I saw someone drop it in Room 1, I cannot search only Room 1 and never search Room 2 and still find the key in all cases, because there may be a way for the key to have moved to Room 2 without my knowledge.

For instance, as an example of something I might expect, a mouse could have grabbed it and quietly gone back to its nest in Room 2. Now, that’s something I would expect, so while searching for the key I should also note any mice I see. They might have moved it.

But I also have to have a method for handling situations I would not expect. Maybe the key activated a small device which moved it to Room 2 through a hidden passage in the wall and then quietly self-destructed, leaving no trace of the device that is within my ability to detect in Room 1. (Plenty of traces were left in Room 2, but I can’t see Room 2 from Room 1.) That is an outside possibility, but it doesn’t break the laws of physics or require incomprehensible technology, so it could have happened.

2: There are also a large number of alternative thought experiments which have massive expected utility gain. Because of the halting problem, I can’t necessarily determine how long it is going to take to figure these problems out, or whether they can be figured out at all. If I allow myself to get stuck on any one problem, I may have picked an unsolvable one while the NEXT problem with a massive expected utility gain is actually solvable. Under that logic, it’s still bad to spend all my time thinking about one particular question.

3: Thanks to parallelism, it is entirely possible for a program to run multiple different problems at the same time. Even I can do this to a lesser extent: I can think about a philosophy problem and eat at the same time. An FAI running into a Pascal’s Mugger could begin weighing the utility of giving in to the mugging, ignoring the mugging, attempting to knock out the mugger, or simply saying, “Let me think about that. I will let you know when I have decided whether or not to give you the money, and will get back to you,” all at the same time.

Having reviewed this discussion, I realize that I may just be restating the problem here. A lot of the proposed situations I’m discussing take the form “But what if this OTHER situation exists and the utilities indicate you pick the counterintuitive solution? But what if this OTHER situation exists and the utilities indicate you pick the intuitive solution?”

To approach the problem more directly, maybe a better approach would be to consider Gödel’s incompleteness theorems. Quoting from Wikipedia:

“The first in­com­plete­ness the­o­rem states that no con­sis­tent sys­tem of ax­ioms whose the­o­rems can be listed by an “effec­tive pro­ce­dure” (es­sen­tially, a com­puter pro­gram) is ca­pa­ble of prov­ing all facts about the nat­u­ral num­bers. For any such sys­tem, there will always be state­ments about the nat­u­ral num­bers that are true, but that are un­prov­able within the sys­tem.”

If the FAI in question is considering utility in terms of natural numbers, it seems to make sense that there are things it should do to maximize utility that it would not be able to prove inside its own system. To take that into account, we would have to design it to call for help in situations which had the appearance of being likely to be unprovable.

Based on Alan Turing’s proof that the halting problem is undecidable, if the FAI can only be treated as a Turing machine, it can’t establish whether or not some situations are provable. That seems to mean it would have to have some kind of hard stopping point, something like “Call for help, and do nothing but call for help, if you have been running for one hour and can’t figure this out”, or alternatively “Take an action based on your current guess of the probabilities if you can’t figure this out after one hour, and if at least one of the two probabilities is still incalculable, choose randomly.”
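That “act on your current guess after one hour” rule can be sketched as a deadline-with-fallback pattern (my own illustration; `deliberate`, `slow_proof_search`, and the decision strings are hypothetical stand-ins):

```python
import concurrent.futures
import time

def decide_with_deadline(deliberate, fallback, seconds):
    """Run an open-ended deliberation, but act on a fallback guess if it has
    not halted by the deadline -- in general we cannot know in advance
    (halting problem) whether it ever will."""
    ex = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = ex.submit(deliberate)
        try:
            return future.result(timeout=seconds)
        except concurrent.futures.TimeoutError:
            return fallback()
    finally:
        ex.shutdown(wait=False)

def slow_proof_search():
    time.sleep(0.2)          # stand-in for deliberation that runs too long
    return "pay the mugger"

# Deliberation misses the 0.05 s deadline, so we act on the current best guess.
print(decide_with_deadline(slow_proof_search, lambda: "keep the money", 0.05))
# A deliberation that halts in time returns its own answer.
print(decide_with_deadline(lambda: "provable: ignore", lambda: "keep the money", 1.0))
```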

This is again get­ting a bit long, so I’ll stop writ­ing for a bit to dou­ble check that this seems rea­son­able and that I didn’t miss some­thing.

You seem to be go­ing far afield. The tech­ni­cal con­clu­sion of the first ar­gu­ment is that one should spend all one’s re­sources deal­ing with cases with in­finite or very high util­ity, even if they are mas­sively im­prob­a­ble. The way I said it ear­lier was im­pre­cise.

When humans deal with a problem they can’t solve, they guess. It should not be difficult to build an AI that can solve everything humans can solve. I think the “solution” to Gödelization is a mathematical intuition module that finds rough guesses, not asking another agent. What special powers does the other agent have? Why can’t the AI just duplicate them?

Think­ing about it more, I agree with you that I should have phrased ask­ing for Help bet­ter.

Using humans as the other agents, just duplicating all powers available to humans seems like it would cause a noteworthy problem. Assume an AI researcher named Maria follows my understanding of your idea. She creates a Friendly AI and includes a critical block of code:

If UNFRIENDLY=TRUE then HALT;

(Un)friendliness isn’t binary, but it seems like it makes a simpler example.

The AI (since it has duplicated the special powers of human agents) overwrites that block of code and replaces it with a CONTINUE command. Certainly its creator Maria could do that.

Well, clearly we can’t let the AI duplicate that PARTICULAR power. Even if it would never use it under any circumstances of normal processing (something which, given the halting problem, I don’t think it can actually guarantee), it’s very insecure for that power to be available to the AI if anyone were to try to hack the AI.

When you think about it, some­thing like The Pas­cal’s Mug­ging for­mu­la­tion is it­self a hack, at least in the sense I can de­scribe both as “Here is a string of let­ters and num­bers from an un­trusted source. By giv­ing it to you for pro­cess­ing, I am at­tempt­ing to get you to do some­thing that harms you for my benefit.”

So if I attempt to give our Friendly AI security measures to protect it from hacks turning it into an Unfriendly AI, these security measures seem like they would require it to lose some powers that it would have if the code were more open.

I think it makes more sense to de­sign an AI that is ro­bust to hacks due to a fun­da­men­tal logic than to try to patch over the is­sues. I would not like to dis­cuss this in de­tail, though—it doesn’t in­ter­est me.

This is an in­stance of the gen­eral prob­lem of at­tach­ing a prob­a­bil­ity to ma­trix sce­nar­ios. And you can pas­cal-mug your­self, with­out any­one show­ing up to as­sert or de­mand any­thing—just think: what if things are set up so that whether I do, or do not do, some­thing, de­ter­mines whether those 3^^^^3 peo­ple will be cre­ated and de­stroyed? It’s just as pos­si­ble as the situ­a­tion in which a mes­sen­ger from Out­side shows up and tells you so.

The ob­vi­ous way to at­tach prob­a­bil­ities to ma­trix sce­nar­ios is to have a unified no­tion of pos­si­ble world ca­pa­cious enough to en­com­pass both ma­trix wor­lds and wor­lds in which your cur­rent ex­pe­riences are veridi­cal; and then you look at rel­a­tive fre­quen­cies or por­tions of world-mea­sure for the two classes of pos­si­bil­ity. For ex­am­ple, you could as­sume the cor­rect­ness of our cur­rent physics across all pos­si­ble wor­lds, and then make a Drake/​Bostrom-like guessti­mate of the fre­quency of ma­trix con­struc­tion across all those uni­verses, and of the “de­mog­ra­phy” and “poli­ti­cal char­ac­ter” of those simu­la­tions. Garbage in, garbage out; but you re­ally can get an an­swer if you make enough as­sump­tions. In that re­gard, it is not too differ­ent to any other com­pli­cated de­ci­sion made against a back­ground of profound un­cer­tainty.

One place where it might fall down is that our di­su­til­ity for caus­ing deaths is prob­a­bly not lin­ear in the num­ber of deaths, just as our util­ity for money flat­tens out as the amount gets large. In fact, I could imag­ine that its value is con­nected to our abil­ity to in­tu­itively grasp the num­bers in­volved. The di­su­til­ity might flat­ten out re­ally quickly so that the di­su­til­ity of caus­ing the death of 3^^^^3 peo­ple, while large, is still small enough that the small prob­a­bil­ities from the in­duc­tion are not over­whelmed by it.

I think that Robin’s point solves this prob­lem, but doesn’t solve the more gen­eral prob­lem of an AGI’s re­ac­tion to low prob­a­bil­ity high util­ity pos­si­bil­ities and the at­ten­dant prob­lems of non-con­ver­gence.
The guy with the but­ton could threaten to make an ex­tra-pla­nar fac­tory farm con­tain­ing 3^^^^^3 pigs in­stead of kil­ling 3^^^^3 hu­mans. If util­ities are ad­di­tive, that would be worse.

Robin: Great point about states with many people having low correlations with what one random person can affect. This is fairly trivially provable.

Utili­tar­ian: Equal pri­ors due to com­plex­ity, equal pos­te­ri­ors due to lack of en­tan­gle­ment be­tween claims and facts.

Wei Dai, Eliezer, Stephen, g: This is a great thread, but it’s get­ting very long, so it seems likely to be lost to pos­ter­ity in prac­tice. Why don’t the three of you read the pa­per Neel Kr­ish­naswami refer­enced, have a chat, and post it on the blog, pos­si­bly ed­ited, as a main post?

If I make a large uniquely struc­tured ar­row point­ing at my­self from or­bit so that a very sim­ple Tur­ing ma­chine can scan the uni­verse and lo­cate me, does the value of my ex­is­tence go up?

Yes.

Do­ing some­thing like that proves you’re clever enough to come up with a plan for some­thing that’s unique in all the uni­verse, and then mar­shal the re­sources to make it hap­pen. That’s worth some­thing.

(I origi­nally had a much longer com­ment, but it was lost in some sort of web­site glitch. This is the Reader’s Digest ver­sion)

I think algorithmic complexity does, to a certain degree, usefully represent what we value about human life: uniqueness of experience, depth of character, whatever you want to call it. For myself, at least, I would feel fewer qualms about Matrix-generating 100 atom-identical Smiths and then destroying them than I would generating 100 individual, diverse people who each had different personalities, dreams, judgements, and feelings. It even captures the basic reason, I think, behind scope insensitivity; namely, that we see the number on paper as just a faceless mob of many, many identical people, so we have no emotional investment in them as a group.

On the other hand, I had a bad feeling when I read this solution, which I still have now. Namely, it solves the dilemma, but not at the point where it’s problematic; we can immediately tell that there’s something wrong with handing over five bucks when we read about it, and it has little to do with the individual uniqueness of the people in question. After all, who should you push from the path of an oncoming train: Jaccqkew’Zaa’KK, The Uniquely Damaged Sociopath (And Part-Time Rapist), or a hard-working, middle-aged, balding office worker named Fred Jones?

Are you re­ply­ing to the cor­rect com­ment? If so, I don’t un­der­stand what you mean, but I’m pretty sure Jac­c­qkew’Zaa’KK goes un­der the train. Which is a tragedy if he just has cruel friends who give ter­rible nick­names.

I’m re­ply­ing to Atorm’s dis­pu­ta­tion of Strange7′s re­sponse to Eliezer’s re­sponse to Wei Dai’s idea about us­ing al­gorith­mic com­plex­ity as a moral prin­ci­ple as a solu­tion to the Pas­cal’s Mug­ging dilemma. If I got that chain wrong and I’m re­spond­ing to some com­pletely differ­ent dis­cus­sion, then I apol­o­gize for con­fus­ing ev­ery­one and it would be nice if you could point me to the thread I’m look­ing for. :)

(And yes, Jaccqkew’Zaa’KK goes under the train, and he really is a sociopathic rapist; I was using that thought experiment as an example of a situation where the algorithmic complexity rule doesn’t work)

avoid UnFriendly Be­hav­ior (for ex­am­ple, mur­der­ing peo­ple to free up their re­sources) in the pro­cess of do­ing step (1)

If the AI falls prey to a paradox early on in the process of self-improvement, the FAI has failed and has to be shut down or patched.

Why is that a prob­lem? Be­cause if the AI falls prey to a para­dox later on in the pro­cess of self-im­prove­ment, when the com­puter can out­smart hu­man be­ings, the re­sult could be catas­trophic. (As Eliezer keeps point­ing out: a ra­tio­nal AI might not agree to be patched, just as Gandhi would not agree to have his brain mod­ified into be­com­ing a psy­chopath, and Hitler would not agree to have his brain mod­ified to be­come an egal­i­tar­ian. All things equal, ra­tio­nal agents will try to block any ac­tions that would pre­vent them from ac­com­plish­ing their cur­rent goals.)

So you want to cre­ate an el­e­gant (to the point, ideally, of be­ing “prov­ably cor­rect”) struc­ture that doesn’t need patches or hacks. If you have to con­stantly patch or hack early on in the pro­cess, that in­creases the chances that you’ve missed some­thing fun­da­men­tal, and that the AI will fail later on, when it’s too late to patch.

You could always just give up being a consequentialist and deontologically refuse to give in to the demands of anyone taking part in a Pascal’s mugging, because consistently giving in would lead to the breakdown of society.

pdf23ds, un­der cer­tain straight­for­ward phys­i­cal as­sump­tions, 3^^^^3 peo­ple wouldn’t even fit in any­one’s fu­ture light-cone, in which case the prob­a­bil­ity is liter­ally zero. So the as­sump­tion that our ap­par­ent physics is the physics of the real world too, re­ally could serve to de­cide this ques­tion. The only prob­lem is that that as­sump­tion it­self is not very rea­son­able.

Lack­ing for the mo­ment a ra­tio­nal way to de­limit the range of pos­si­ble wor­lds, one can uti­lize what I’ll call a Chalmers prior, which sim­ply speci­fies di­rectly how much time you will spend think­ing about ma­trix sce­nar­ios. (I name it for David Chalmers be­cause I once heard him give an es­ti­mate of the odds that we are in a ma­trix; I think it was about 10%.) The ra­tio­nal­ity of hav­ing a Chalmers prior can be jus­tified by ob­serv­ing one’s own cog­ni­tive re­source-bound­ed­ness, and the ap­par­ently end­less amount of time one could spend think­ing about ma­trix sce­nar­ios. (Is there a name for this sort of schedul­ing meta-heuris­tic, in which one limits the pro­cess­ing time available for po­ten­tially non­ter­mi­nat­ing lines of thought?)

This case seems to sug­gest the ex­is­tence of new in­ter­est­ing ra­tio­nal­ity con­straints, which would go into choos­ing ra­tio­nal prob­a­bil­ities and util­ities. It might be worth work­ing out what con­straints one would have to im­pose to make an agent im­mune to such a mug­ging.

Philoso­phers of re­li­gion ar­gue quite a lot about Pas­cal’s wa­ger and very large util­ities or in­finite util­ities. I haven’t both­ered to read any of those pa­pers, though. As an ex­am­ple, here is Alexan­der Pruss.

Incidentally: How would it affect your intuition if you instead could participate in the Intergalactic Utilium Lottery, where probabilities and payoffs are the same but where you trust the organizers to do what they promise?

If I ac­tu­ally trust the lot­tery offi­cials, that means that I have cer­tain knowl­edge of the util­ity prob­a­bil­ities and costs for each of my choices. Thus, I guess I’d choose whichever op­tion gen­er­ated the most util­ity, and it wouldn’t be a mat­ter of “in­tu­ition” any more.

Ap­ply­ing that logic to the ini­tial Mug­ger prob­lem, if I calcu­lated, and was cer­tain of, there be­ing at least a 1 in 3^^^^3 chance that the mug­ger was tel­ling the truth, then I’d pay him. In fact, I could men­tally re­for­mu­late the prob­lem to have the mug­ger say­ing “If you don’t give me $5, I will use the pow­ers vested in me by the In­ter­galac­tic Utilium Lot­tery Com­mis­sion to gen­er­ate a ran­dom num­ber be­tween 1 and N, and if it’s a 7, then I kill K peo­ple.” I then di­vide K by N to get an idea of the full moral force of what’s go­ing on. If K/​N is even within sev­eral or­ders of mag­ni­tude of 1, I’d bet­ter pay up.
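That reformulation can be put in a few lines. The numbers below are hypothetical, chosen only to illustrate the K/N comparison:

```python
# Toy illustration of the reformulated mugging: the mugger draws a number
# from 1 to N and kills K people on a hit. All values here are made up.
def expected_deaths(K, N):
    """Expected number of deaths if the threat is carried out as stated."""
    return K / N

# If K/N is within a few orders of magnitude of 1, the moral force of the
# threat is comparable to one certain death, and $5 looks cheap.
print(expected_deaths(10**9, 10**10))  # 0.1 expected deaths: pay up
print(expected_deaths(100, 10**12))    # 1e-10: safely ignorable
```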

The prob­lem is the un­cer­tainty. Solomonoff in­duc­tion gives the claim “I can kill 3^^^^3 peo­ple any time I want” a sub­stan­tial prob­a­bil­ity, whereas “com­mon sense” will usu­ally give it liter­ally zero. If we trust the lot­tery guys, ques­tions of in­duc­tion ver­sus com­mon sense be­come moot—we know the prob­a­bil­ity, and must act on it.

I think this is ac­tu­ally the core of the is­sue—not cer­tainty of your prob­a­bil­ity, per se, but rather how it is de­rived. I think I may have fi­nally solved this!

See if you can fol­low me on this…
If Pas­cal Mug­gers were com­pletely in­de­pen­dent in­stances of each other—that is, ev­ery per­son at­tempt­ing a Pas­cal’s Mug­ging has their own unique story and mo­ti­va­tion for ini­ti­at­ing it, with­out it cor­re­lat­ing to you or the other mug­gers, then you have no ad­di­tional in­for­ma­tion to go on. You shut up and mul­ti­ply, and if the util­ity calcu­la­tion comes out right, you pay the mug­ger. Sure, you’re al­most cer­tainly throw­ing money away, but the off-chance more than offsets this by defi­ni­tion. Note that the prob­a­bil­ity calcu­la­tion it­self is com­pli­cated and not lin­ear: Claiming higher num­bers in­creases the prob­a­bil­ity that they are ly­ing. How­ever it’s still pos­si­ble they would come up with a num­ber high enough to over­ride this func­tion.

At which point we pre­vi­ously said: “Aha! So this is a los­ing strat­egy! The Mug­ger ought not be able to ar­bi­trar­ily ma­nipu­late me in this man­ner!”
Or: “So what’s stop­ping the mug­ger from up­ping the num­ber ar­bi­trar­ily, or mug­ging me mul­ti­ple times?”
…To which I an­swer, “check the as­sump­tions we started with”.

Note that the as­sump­tion was that the Mug­ger is not in­fluenced by me, nor by other mug­gings. The mug­ger’s rea­sons for mak­ing the claim are their own. So “not try­ing to ma­nipu­late me know­ing my al­gorithm” was an ex­plicit as­sump­tion here.

What if we get rid of the assumption? Why, then an increasingly higher utility claim (or recurring muggings) doesn’t just raise the probability that the mugger is wrong or lying for their own inscrutable reasons. It additionally raises the probability that they are lying to manipulate me, knowing (or guessing) my algorithm.

Ba­si­cally, I add in the ques­tion “why did the mug­ger choose the num­ber 3^^^3 and not 1967? This makes it more likely that they are try­ing to over­whelm my al­gorithm, (mis­tak­enly) think­ing that it can thus be over­whelmed”. If the mug­ger chooses 4^^^4 in­stead, this fur­ther (and pro­por­tion­ally?) in­creases said sus­pi­cion. And so on.

I pro­pose that the com­bined weight of these prob­a­bil­ities rises faster than the claimed util­ity. If that is the case, then for all claimed util­ities x higher than N, where N is a num­ber that prompts a nega­tive ex­pected util­ity re­sult, x would like­wise pro­duce a nega­tive ex­pected util­ity re­sult.
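One way to make that proposal concrete is a toy model in which the probability of a sincere mugger falls slightly faster than the claimed harm grows. The constants k and alpha below are assumptions for illustration, not derived quantities:

```python
# Toy model: suspicion rises faster than the stated utility, so
# p_genuine(N) = k / N**alpha with alpha > 1 (assumed, not derived).
def expected_harm_if_refused(N, k=0.01, alpha=1.1):
    """Expected lives lost by refusing, under the assumed suspicion model."""
    p_genuine = k / N**alpha
    return p_genuine * N          # equals k / N**(alpha - 1): decreasing in N

# Bigger claims produce *smaller* expected harm, so the expected-utility
# curve for paying really does slope downward past some threshold N.
for N in (10**3, 10**6, 10**9):
    print(N, expected_harm_if_refused(N))
```

This is exactly the testable behavior proposed above: check whether the slope stays negative as the claim grows.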

Presumably, for an AI with a good enough grasp of motives and manipulation, this would not pose a problem for very long. We can specifically test for this behavior, checking the AI’s analysis for increasingly higher claims and seeing whether the expected utility function really has a downward slope under these conditions.

I can try to fur­ther math­e­ma­tize this (is this even a real word?). Is this nec­es­sary?
The an­swer seems su­perfi­cially satis­fac­tory. Have I ac­tu­ally solved it? I don’t re­ally have a lot of time to keep grap­pling with it (been think­ing about this on and off for the past few months), so I would wel­come crit­i­cism even more than usual.

This is a very good point—the higher the num­ber cho­sen, the more likely it is that the mug­ger is ly­ing—but I don’t think it quite solves the prob­lem.

The prob­a­bil­ity that a per­son, out to make some money, will at­tempt a Pas­cal’s Mug­ging can be no greater than 1, so let’s imag­ine that it is 1. Every time I step out of my front door, I get mobbed by Pas­cal’s Mug­gers. My mail box is full of Pas­cal’s Chain Let­ters. When­ever I go on­line, I get pop­ups say­ing “Click this link or 3^^^^3 peo­ple will die!”. Let’s say I get one Pas­cal-style threat ev­ery cou­ple of min­utes, so the prob­a­bil­ity of get­ting one in any given minute is 0.5.

Then, let the prob­a­bil­ity of some­one gen­uinely hav­ing the abil­ity to kill 3^^^^3 peo­ple, and then choos­ing to threaten me with that, be x per minute—that is, over the course of one minute, there’s an x chance that a gen­uine ex­tra-Ma­trix be­ing will con­tact me and make a Pas­cal Mug­ging style threat, on which they will ac­tu­ally de­liver.

Nat­u­rally, x is tiny. But, if I re­ceive a Pas­cal threat dur­ing a par­tic­u­lar minute, the prob­a­bil­ity that it’s gen­uine is x/​(0.5+x), or ba­si­cally 2x. If 2x * 3^^^^3 is at all close to 1, then what can I do but pay up? Like it or not, Pas­cal mug­gings would be more com­mon in a world where peo­ple can carry out the threat, than in a world where they can’t. No amount of anal­y­sis of the mug­gers’ psy­chol­ogy can change the prior prob­a­bil­ity that a gen­uine threat will be made—it just in­creases the amount of noise that hides the gen­uine threat in a sea of op­por­tunis­tic mug­gings.
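As a quick numerical sanity check of the x/(0.5+x) ≈ 2x step (the value of x below is arbitrary, chosen only to be tiny):

```python
# Bayesian update: given that a threat arrived this minute, how likely is
# it genuine? Fake threats arrive with probability 0.5 per minute.
def p_genuine_given_threat(x, p_fake_per_minute=0.5):
    """P(genuine | a threat arrived this minute), for genuine-threat rate x."""
    return x / (p_fake_per_minute + x)

x = 1e-30
print(p_genuine_given_threat(x))  # agrees with 2 * x to many digits
```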

But that is pre­cisely it—it’s no longer a Pas­cal mug­ging if the threat is cred­ible. That is, in or­der to be suc­cess­ful, the mug­ger needs to be able to up the util­ity claim ar­bi­trar­ily! It is as­sumed that we already know how to han­dle a cred­ible threat, what we didn’t know how to deal with was a mug­ger who could always make up a big­ger num­ber, to a de­gree where the seem­ing im­pos­si­bil­ity of the claim no longer offsets the claimed util­ity. But as I showed, this only works if you don’t en­ter the mug­ger’s thought pro­cess into the calcu­la­tion.

This ac­tu­ally brings up an im­por­tant corol­lary to my ear­lier point: The higher the num­ber, the less likely the cou­pling is be­tween the mug­ger’s claim and the mug­ger’s in­tent.

A person who can kill another person might well want $5, for whatever reason.
In con­trast, a per­son who can use power from be­yond the Ma­trix to tor­ture 3^^^3 peo­ple already has IMMENSE power. Clearly such a per­son has all the money they want, and even more than that in the in­fluence that money rep­re­sents. They can prob­a­bly cre­ate the money out of noth­ing. So already their claims don’t make sense if taken at face value.

Maybe the mug­ger just wants me to sur­ren­der to an ar­bi­trary threat? But in that case, why me? If the mug­ger re­ally has im­mense power, they could cre­ate a per­son they know would cave in to their de­mands.

Maybe I’m spe­cial for some rea­son. But if the mug­ger is REALLY that pow­er­ful, wouldn’t they be able to pre­dict my ac­tions be­fore­hand, a-la Omega?

Each rise in claimed util­ity brings with it a host of as­sump­tions that need to be made for the ac­tion-claimed re­ac­tion link to be main­tained. And re­mem­ber, the mug­ger’s abil­ity is not the only thing dic­tat­ing ex­pected util­ity, but also the mug­ger’s in­ten­tions. Each such as­sump­tion not only weak­ens the prob­a­bil­ity of the mug­ger car­ry­ing out their threat be­cause they can’t, it also raises the prob­a­bil­ity of the mug­ger re­ward­ing re­fusal and/​or pun­ish­ing com­pli­ance. Just be­cause the off-chance comes true and the mug­ger con­tact­ing me ac­tu­ally CAN carry out the threat, does not make them sincere; the mug­ger might be test­ing my ra­tio­nal­ity skills, for in­stance, and could severely pun­ish me for failing the test.

As the claimed utility approaches infinity, so does the scenario approach Pascal’s Wager: An unknowable, symmetrical situation, where an infinite number of possible outcomes cancel each other out. The one outcome that isn’t canceled out is the loss of $5. So the net utility is negative. So I don’t comply with the mugger.

I’m still not sure I’m fully satisfied with the level of math my explanation has, even though I’ve tried to set the solution in terms of limits and attractors. But I think I can draw a graph that dips under zero utility fairly quickly (or maybe doesn’t really ever go over it?), and never goes back up, asymptotic at −$5 utility. Am I wrong?

A person who can kill another person might well want $5, for whatever reason. In contrast, a person who can use power from beyond the Matrix to torture 3^^^3 people already has IMMENSE power. Clearly such a person has all the money they want, and even more than that in the influence that money represents. They can probably create the money out of nothing. So already their claims don’t make sense if taken at face value.

Ah, my mistake. You’re arguing based on the intent of a legitimate mugger, rather than the fakes. Yes, that makes sense. If we let f(N) be the probability that somebody has the power to kill N people on demand, and g(N) be the probability that somebody who has the power to kill N people on demand would threaten to do so if he doesn’t get his $5, then it seems highly likely that Nf(N)g(N) approaches zero as N approaches infinity. What’s even better news is that, while f(N) may only approach zero slowly for easily constructed values of N like 3^^^^3 and 4^^^^4 because of their low Kolmogorov complexity, g(N) should scale with 1/N or something similar, because the more power someone has, the less likely they are to execute such a minuscule, petty threat. You’re also quite right in stating that the more power the mugger has, the more likely it is that they’ll reward refusal, punish compliance or otherwise decouple the wording of the threat from their actual intentions, thus making g(N) go to zero even more quickly.
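Under assumed functional forms for f and g (purely illustrative: f decaying logarithmically, g slightly faster than 1/N), one can check numerically that the product Nf(N)g(N) shrinks as N grows:

```python
import math

# Assumed forms, for illustration only. f(N) decays slowly because
# easily-described numbers keep nontrivial prior mass; g(N) ~ N**-1.2
# models powerful agents being even less likely to make petty threats.
def f(N):
    return 1.0 / math.log2(N)

def g(N):
    return N ** -1.2

for N in (10**2, 10**6, 10**12):
    print(N, N * f(N) * g(N))  # the product shrinks as N grows
```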

So, yeah, I’m pretty satis­fied that Nf(N)g(N) will asymp­tote to zero, tak­ing all of the above into ac­count.

(In more un­re­lated news, my boyfriend claims that he’d pay the mug­ger, on ac­count of him ob­vi­ously be­ing men­tally ill. So that’s two out of three in my house­hold. I hope this doesn’t catch on.)

That is back­ward. It is only a Pas­cal mug­ging if the threat is cred­ible. Like one made by Omega, who you men­tion later on.

Huh? Isn’t the whole point of Pas­cal’s mug­ging that it isn’t likely and the mug­ger makes up for the lack of cred­i­bil­ity by mak­ing the threat mas­sive? If the mug­ger is mak­ing a cred­ible threat we just call that a mug­ging.

“The threat has to be cred­ible at the level of prob­a­bil­ity it is as­signed. ”

And what, pre­cisely, does THAT mean?
If I try to taboo some words here, I get “we must eval­u­ate the like­li­hood of some­thing hap­pen­ing as the like­li­hood we as­signed for it to hap­pen”. That’s sim­ply tau­tolog­i­cal.

No prob­a­bil­ity is ex­actly zero ex­cept for self-con­tra­dic­tory state­ments. So “cred­ible” can’t mean “of zero prob­a­bil­ity” or “im­pos­si­ble to be­lieve”. To me, “cred­ible” means “some­thing I would not have a hard time be­liev­ing with­out re­quiring ex­traor­di­nary ev­i­dence”, which in it­self trans­lates pretty much to “>0.1% prob­a­bil­ity”. If you have some rea­son for dis­t­in­guish­ing be­tween a threat that is not cred­ible and a threat with ex­ceed­ingly low prob­a­bil­ity of be­ing car­ried out, please state it. Also please note that my use of the word makes sense within the origi­nal con­text of my re­ply to HopeFox, who was dis­cussing the im­pli­ca­tions of a world where such threats were not in­cred­ible.

Pas­cal’s mug­ging hap­pens when the prob­a­bil­ity you would as­sign dis­re­gard­ing ma­nipu­la­tion is very low (not a cred­ible threat by nor­mal stan­dards), with the claimed util­ity be­ing ar­bi­trar­ily high to offset this. If that is not the case, it’s a non-challenge and is not par­tic­u­larly rele­vant to our dis­cus­sion.
Does that clar­ify my origi­nal state­ment?

The threat has to be cred­ible at the level of prob­a­bil­ity it is as­signed. It doesn’t have to be likely.

How are you defin­ing cred­ible? It may be that we are us­ing differ­ent no­tions of what this means. I’m us­ing it to mean some­thing like “ca­pa­ble of be­ing be­lieved” or “could be plau­si­bly be­lieved by a some­what ra­tio­nal in­di­vi­d­ual” but these have mean­ings that are close to “likely”.

I’m afraid I don’t fol­low. I don’t quite see how this negates the point I was mak­ing.

While it is con­ceiv­able that I sim­ply lack the math to un­der­stand what you’re get­ting at, it seems to me that a sim­ply-worded ex­pla­na­tion of what you mean (or al­ter­nately a sim­ple ex­pla­na­tion of why you can­not give one) would be more suit­able in this fo­rum. Or if this has already been ex­plained in such terms any­where, a link or refer­ence would like­wise be helpful.

Wei: You do not treat peo­ple in these two branches as equals, but in­stead value the peo­ple in the higher-weight branch more, right? Can you an­swer why you con­sider that to be the right thing to do?

Robin Han­son’s guess about man­gled wor­lds seems very el­e­gant to me, since it means that I can run a (large) com­puter with con­ven­tional quan­tum me­chan­ics pro­grammed into it, no magic in its tran­sis­tors, and the re­sult­ing simu­la­tion will con­tain sen­tient be­ings who ex­pe­rience the same prob­a­bil­ities we do.

Even so, I’d have to con­fess my­self con­fused about why I find my­self in a sim­ple uni­verse rather than a noisy one.

How come we keep talk­ing about man­gled wor­lds and mul­ti­verses… when the Bohm in­ter­pre­ta­tion ac­tu­ally de­rives the Born prob­a­bil­ities as a sta­ble equil­ibrium of the quan­tum po­ten­tial? In one the­ory, we have this mys­te­ri­ous thing that no one is sure how to solve… and in the other the­ory, we have a solu­tion right in front of us. Also, Bohmian me­chan­ics, while non­lo­cal, does not re­quire us to be­lieve in mys­te­ri­ous in­ac­cessible uni­verses where our mea­sure­ments turned out differ­ently.

The num­ber of pos­si­ble Tur­ing ma­chines is countable. Given a func­tion that maps the nat­u­ral num­bers onto the set of pos­si­ble Tur­ing ma­chines, one can con­struct a Tur­ing ma­chine that acts like this:

If ma­chine #1 has not halted, simu­late the ex­e­cu­tion of one in­struc­tion of ma­chine #1

If ma­chine #2 has not halted, simu­late the ex­e­cu­tion of one in­struc­tion of ma­chine #2
If ma­chine #1 has not halted, simu­late the ex­e­cu­tion of one in­struc­tion of ma­chine #1

If ma­chine #3 has not halted, simu­late the ex­e­cu­tion of one in­struc­tion of ma­chine #3
If ma­chine #2 has not halted, simu­late the ex­e­cu­tion of one in­struc­tion of ma­chine #2
If ma­chine #1 has not halted, simu­late the ex­e­cu­tion of one in­struc­tion of ma­chine #1

etc.

This Tur­ing ma­chine, if run, would even­tu­ally make all pos­si­ble com­pu­ta­tions. (One could even run a pro­gram like this on a real, phys­i­cal com­puter, sub­ject to mem­ory and time limi­ta­tions.) Does run­ning such a pro­gram have any eth­i­cal im­pli­ca­tions? If run­ning a perfect simu­la­tion of a re­al­ity is es­sen­tially the same as cre­at­ing that re­al­ity, would run­ning this pro­gram for a long enough pe­riod of time ac­tu­ally cause all pos­si­ble com­putable uni­verses to come into ex­is­tence? Does the ex­is­tence of this pro­gram have any im­pli­ca­tions for the hy­poth­e­sis that “our uni­verse is a com­puter simu­la­tion be­ing run in an­other uni­verse?”
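The interleaving schedule above can be sketched with Python generators standing in for Turing machines. This is an illustration only; real dovetailing would enumerate actual machine encodings rather than three toy machines:

```python
# Three toy "machines", each halting after a few steps.
def machine(name):
    for step in range(3):
        yield f"{name}:{step}"

def dovetail(machines, rounds):
    """Round k steps machines k, k-1, ..., 1, matching the listing above."""
    active = []
    trace = []
    for k in range(rounds):
        if k < len(machines):
            active.append(machine(machines[k]))   # admit machine #k+1
        for m in reversed(active):                # newest first, as listed
            try:
                trace.append(next(m))             # one simulated instruction
            except StopIteration:
                pass                              # machine has halted; skip
    return trace

print(dovetail(["m1", "m2", "m3"], 3))
```

Every machine admitted eventually gets unboundedly many steps, which is the property the comment relies on.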

The an­swer seems fairly sim­ple un­der modal re­al­ism (roughly, the the­sis that all log­i­cally pos­si­ble wor­lds ex­ist in the same sense as math­e­mat­i­cal facts ex­ist, and thus that the term “ac­tual” in “our ac­tual world” is just an in­dex­i­cal).

If the simu­la­tion ac­cu­rately fol­lows a pos­si­ble world, and con­tains a unit of (dis)util­ity, it doesn’t gen­er­ate that unit of (dis)util­ity, it just “dis­cov­ers” it; it proves that for a given world-state an event hap­pens which your util­ity func­tion as­signs a par­tic­u­lar value. Re­peat­ing the simu­la­tion again is also only re­dis­cov­er­ing the same fact, not in any sense cre­at­ing copies of it.

I’ve long felt that simu­la­tions are NOT the same as ac­tual re­al­ities, though I can’t pre­cisely ar­tic­u­late the differ­ence.

One of them has some form of com­pu­ta­tional de­vice on the out­side. One of them doesn’t. Does there need to be more differ­ence than that? ie. If you want to treat them differ­ently and if some sort of phys­i­cal dis­tinc­tion be­tween the two is pos­si­ble then by all means con­sider them differ­ent based on that differ­ence.

I agree that G’s rea­son­ing is an ex­am­ple of scope in­sen­si­tivity. I sus­pect you meant this as a crit­i­cism. It seems un­de­ni­able that scope in­sen­si­tivity leads to some ir­ra­tional at­ti­tudes (e.g. when a per­son who would be hor­rified at kil­ling one hu­man shrugs at wiping out hu­man­ity). How­ever, it doesn’t seem ob­vi­ous that scope in­sen­si­tivity is pure fal­lacy. Mike Vas­sar’s sug­ges­tion that “we should con­sider any num­ber of iden­ti­cal lives to have the same util­ity as one life” seems plau­si­ble. An ex­treme ex­am­ple is, what if the uni­verse were pe­ri­odic in the time di­rec­tion so that ev­ery event gets re­peated in­finitely. Would this mean that ev­ery de­ci­sion has in­finite util­ity con­se­quence? It seems to me that, on the con­trary, this would make no differ­ence to the eth­i­cal weight of de­ci­sions. Per­haps some­how the util­ity binds to the in­for­ma­tion con­tent of a set of events. Pre­sum­ably, the to­tal vari­a­tion in ex­pe­riences a puppy can have while be­ing kil­led would be ex­hausted long be­fore reach­ing 3^^^^^3.

Wei, would it be cor­rect to say that, un­der your in­ter­pre­ta­tion, if our uni­verse ini­tially con­tains 100 su­per happy peo­ple, that cre­at­ing one more per­son who is “very happy” but not “su­per happy” is a net nega­tive, be­cause the “mea­sure” of all the 100 su­per happy peo­ple gets slightly dis­counted by this new per­son?

It’s hard to see why I would con­sider this the right thing to do—where does this mys­te­ri­ous “mea­sure” come from?

If you be­lieve in the many wor­lds in­ter­pre­ta­tion of quan­tum me­chan­ics, you have to dis­count the util­ity of each of your fu­ture selves by his mea­sure, in­stead of treat­ing them all equally. The ob­vi­ous gen­er­al­iza­tion of this idea is for the al­tru­ist to dis­count the util­ity he as­signs to other peo­ple by their mea­sures, in­stead of treat­ing them all equally.

But in­stead of us­ing the QM mea­sure (which doesn’t make sense “out­side the Ma­trix”), let the mea­sure of each per­son be in­versely re­lated to his al­gorith­mic com­plex­ity (his per­sonal al­gorith­mic com­plex­ity, which is equal to the al­gorith­mic com­plex­ity of his uni­verse plus the amount of in­for­ma­tion needed to lo­cate him within that uni­verse), and the prob­lem is solved. The util­ity of a Tur­ing ma­chine can no longer grow much faster than its prior prob­a­bil­ity shrinks, since the sum of mea­sures of peo­ple com­puted by a Tur­ing ma­chine can’t be larger than its prior prob­a­bil­ity.

Utility func­tions have to be bounded ba­si­cally be­cause gen­uine mar­t­in­gales screw up de­ci­sion the­ory—see the St. Peters­burg Para­dox for an ex­am­ple.

Economists, statisticians, and game theorists are typically happy to do so, because utility functions don’t really exist: they aren’t uniquely determined by someone’s preferences. For example, you can multiply any utility function by a positive constant, and get another utility function that produces exactly the same observable behavior.
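A minimal check of that invariance, with a toy utility function (all names and numbers here are illustrative):

```python
def expected_utility(u, lottery):
    """lottery: list of (probability, outcome) pairs."""
    return sum(p * u(x) for p, x in lottery)

u = lambda x: x ** 0.5            # an arbitrary toy utility function
v = lambda x: 3 * u(x)            # the same function scaled by 3

safe = [(1.0, 100)]               # $100 for sure
risky = [(0.5, 300), (0.5, 0)]    # coin flip between $300 and $0

# Both utility functions rank the two gambles identically, so no choice
# behavior can distinguish u from v.
print(expected_utility(u, safe) > expected_utility(u, risky))  # True
print(expected_utility(v, safe) > expected_utility(v, risky))  # True
```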

I always won­dered why peo­ple be­lieve util­ity func­tions are U(x): R^n → R^1 for some n. I’m no de­ci­sion the­o­rist, but I see no rea­son util­ities can’t func­tion on the ba­sis of a par­tial or­der­ing rather than a to­tally or­dered nu­mer­i­cal func­tion.

I’m no de­ci­sion the­o­rist, but I see no rea­son util­ities can’t func­tion on the ba­sis of a par­tial or­der­ing rather than a to­tally or­dered nu­mer­i­cal func­tion.

The to­tal or­der­ing is re­ally nice be­cause it means we can move from the messy world of out­comes to the neat world of real num­bers, whose val­ues are prob­a­bil­is­ti­cally rele­vant. If we move from to­tal or­der­ing to par­tial or­der­ing, then we are no longer able to make prob­a­bil­is­tic judg­ments based only on the util­ities.

If you have some mul­ti­di­men­sional util­ity func­tion, and a way to de­ter­mine your prob­a­bil­is­tic prefer­ences be­tween any un­cer­tain gam­ble be­tween out­comes x and y and a cer­tain out­come z, then I be­lieve you should be able to find the real func­tion that ex­presses those prob­a­bil­is­tic prefer­ences, and that’s your uni­di­men­sional util­ity func­tion. If you don’t have that way to de­ter­mine your prefer­ences, then you’ll be in­de­ci­sive, which is not some­thing we like to build in to our de­ci­sion the­o­ries.

Give me five dol­lars, or I will kill as many pup­pies as it takes to make you. And they’ll go to hell. And there in that hell will be fire, brim­stone, and rap with En­gr­ish lyrics.

I think the problem is not Solomonoff induction or Kolmogorov complexity or Bayesian rationality, whatever the difference is, but you. You don’t want an AI to think like this because you don’t want it to kill you. Meanwhile, to a true altruist, it would make perfect sense.

Not re­ally con­fi­dent. It’s ob­vi­ous that no so­ciety of self­ish be­ings whose mem­bers think like this could func­tion. But they’d still, ab­surdly, be hap­pier on av­er­age.

Well, in that case, one pos­si­ble re­sponse is for me to kill YOU (or re­port you to the po­lice who will ar­rest you for threat­en­ing mass an­i­mal cru­elty). But if you’re re­ally a su­per-in­tel­li­gent be­ing from be­yond the simu­la­tion, then try­ing to kill you will in­evitably fail and prob­a­bly cause those 3^^^^3 peo­ple to suffer as a re­sult.

(The most plau­si­ble sce­nario in which a Pas­cal’s Mug­ging oc­curs? Our simu­la­tion is be­ing tested for its co­her­ence in ex­pected util­ity calcu­la­tions. Fail the test and the simu­la­tion will be ter­mi­nated.)

Mitchell, I don’t see how you can Pas­cal-mug your­self. Tom is right that the pos­si­bil­ity that typ­ing QWERTYUIOP will de­stroy the uni­verse can be safely ig­nored; there is no ev­i­dence ei­ther way, so the prob­a­bil­ity equals the prior, and the Solomonoff prior that typ­ing QWERTYUIOP will save the uni­verse is, as far as we know, ex­actly the same. But the mug­ger’s threat is a shred of Bayesian ev­i­dence that you have to take into ac­count, and when you do, it mas­sively tips the ex­pected util­ity bal­ance. Your sug­gested solu­tion does seem right but ut­terly in­tractable.

I don’t think the QWERTYUIOP thing is liter­ally zero Bayesian ev­i­dence ei­ther. Sup­pose the thought of that par­tic­u­lar pos­si­bil­ity was man­u­ally in­serted into your mind by the simu­la­tion op­er­a­tor.

I have a very poor un­der­stand­ing of both prob­a­bil­ity and an­a­lytic philos­o­phy so in the in­evitable sce­nario where I’m com­pletely wrong be kind.

But if you can con­ceive of a sce­nario where there’s a prob­a­bil­ity that do­ing some­thing will re­sult in in­finite gain, but you can also pic­ture an equally prob­a­ble sce­nario where do­ing NOTHING will re­sult in equal gain, then don’t they can­cel each other out?

If there’s a prob­a­bil­ity that be­liev­ing in god will give you in­finite gain, isn’t there an equal prob­a­bil­ity that not be­liev­ing in god will re­sult in in­finite gain?

So if the only merit to a scenario is that someone came up with it, it can be countered with a contradicting scenario that someone came up with. There are an infinite number of claims with no evidence to support them that all have a finitely small probability, but every single one of those claims has a contradicting claim with equal probability.

So only be­liefs with ev­i­dence to sup­port them should be con­sid­ered, be­cause only those be­liefs don’t have a con­tra­dict­ing be­lief with an equal prob­a­bil­ity.

So isn’t Pascal’s wager pretty stupid? If there’s an infinite gain in believing in god there’s also an infinite gain in not believing in god. The probability is equal to an infinite number of contradicting probabilities, and therefore effectively nonexistent.

Now please tell me how I’m wrong so I can stop hav­ing a false sense of ac­com­plish­ment.

That said, Shield’s question is not whether, according to Pascal’s Wager, that symmetric probability exists.

If (as Shield sug­gests) “only be­liefs with ev­i­dence to sup­port them should be con­sid­ered, be­cause only those be­liefs don’t have a con­tra­dict­ing be­lief with an equal prob­a­bil­ity,” then ac­cept­ing the posited asym­me­try with­out ev­i­dence is an er­ror.

There is certainly evidence to support the existence of god (God, a god, gods, etc.). Most people around here don’t find it convincing, but billions of people around the globe do.

Perhaps the issue should be formulated in terms of the balance of evidence for the proposition A as compared to evidence for not-A. However, this would lead you to probability-weighted outcomes and the usual mechanics, which Pascal’s Wager subverts by dropping an infinity into them.

All in all, the objection “But absence of God could symmetrically lead to eternal life as well” to Pascal’s Wager doesn’t look appealing to me.

There is certainly evidence to support the existence of god (God, a god, gods, etc.). Most people around here don’t find it convincing, but billions of people around the globe do.

We’re not talking about the existence of god. You’re forgetting the law of burdensome detail.

Pascal’s Wager doesn’t posit that God exists; it posits that God exists and he’ll give us eternal joy if we believe in him.

The claim god exists has an above-negligible probability; the claim god will give you eternal joy, but only if you believe in him, has absolutely no evidence to support it, and is therefore equal to the claim god will give you eternal joy, but only if you don’t believe in him.

If a God exists, since he hasn’t given us any indication of any of his characteristics (if you feel otherwise please argue), we have no evidence to indicate he’d do either.

Hell, I find it more probable that an intelligent deity would reward us for concluding he didn’t exist, since that’s by far the most probable version of reality as determined by the evidence at hand. He’d have to be malevolent to reward us for believing in him if this is the evidence he gives for his existence, and there isn’t any evidence for that either. Maybe life is a test and you win if you realize that based on available evidence the existence of god isn’t sufficiently likely to claim he exists, cough sarcasm cough. This is of course assuming vaguely human motivation and values.

There isn’t any evidence to indicate either; the point of Pascal’s Wager seems to be that a finitely small probability multiplied by an infinite gain is cause for motivation, but this is untrue if that claim is exactly as probable as any made-up contradicting claim.

I would instead break it down into the claim that some Force could theoretically give us eternal bliss or suffering (A), and the further set of complicated claims involved in Pascal’s brand of Christianity.

Conditional on A: the further claim that religion would prevent us from using the Force in the way we’d prefer seems vastly more plausible to me, based on the evidence, than Pascal’s alternative. And there are various other possibilities we’d have to consider. I don’t believe the Wager style of argument works, for the reasons given or alluded to in the OP, but if it worked I believe it would argue for atheism.

The claim god exists has an above-negligible probability; the claim god will give you eternal joy, but only if you believe in him, has absolutely no evidence to support it

I am not quite sure how you reconcile the former and the latter parts of this sentence.

So you think there’s some credible evidence for god’s existence but absolutely none, zero, zilch, nada evidence for the claim that god can give you eternal life and that believing in him increases your chances of receiving it?

If a God exists, since he hasn’t given us any indication of any of his characteristics (if you feel otherwise please argue)

Of course he did. There is a large volume of sacred literature in most cultures which deals precisely with the characteristics of gods. A large chunk of it claims to be revelatory and of divine origin.

I am not quite sure how you reconcile the former and the latter parts of this sentence.

I am not quite sure why I would have an issue. Above negligible in this case means any probability above that of a completely random unfalsifiable hypothesis with no evidence to support it.

So you think there’s some credible evidence for god’s existence but absolutely none, zero, zilch, nada evidence for the claim that god can give you eternal life and that believing in him increases your chances of receiving it?

No, and there’s perfectly valid evidence to believe he wants us to not believe in him. Of course that isn’t actually any evidence of a reward or an afterlife, nor would evidence that he wants us to believe in him be.

The current evidence at hand only indicates that God doesn’t care whether or not we believe in his existence: god is omnipotent and could just give us ACTUAL evidence to convince everyone of his existence, and no such evidence exists.

Of course he did. There is a large volume of sacred literature in most cultures which deals precisely with the characteristics of gods. A large chunk of it claims to be revelatory and of divine origin.

This isn’t evidence. There’s an equal probability of people writing these things in universes where there is no God and universes where there is a God. This is of course an estimation; we haven’t seen what these texts look like in a universe with a god compared to religious texts in a universe without a god, nor the number of them, so the texts we have don’t actually indicate anything about the existence of god.

The absolutely only difference between a religious text and a random hypothesis with no evidence to support it is that a religious text is a random hypothesis with no evidence to support it that someone wrote down.

There isn’t anything in these texts to imply divine origin. They’re full of logical errors and scientific errors, and they contradict themselves internally and among each other.

And if, say, the bible actually were of divine origin, then being full of logical errors, contradictions and scientific errors, it would only indicate that god doesn’t want us to believe in him, which is the opposite of what you’re trying to prove in the first place.

I’ve only made ar­gu­ments I think are cor­rect in re­sponse to points that you made. If I have offended you, that was cer­tainly not the in­tent and you can point to where you think I was rude.

But this is a the­olog­i­cal ar­gu­ment. If you did not want to start a the­olog­i­cal ar­gu­ment, then why did you start a the­olog­i­cal ar­gu­ment?

What is your point?

The origi­nal is­sue was whether you have dis­cov­ered a new failure mode in Pas­cal’s Wager (be­sides a few well-known ones). My view on that re­mains un­changed.

“The origi­nal is­sue”? Were still talk­ing about the same is­sue. Whether or not there’s ev­i­dence to sug­gest that a god would do these things is an in­te­gral part of Pas­cals wa­ger, aka the thing we’ve been talk­ing about for 5 posts, and it’s the only point you’ve made against my ar­gu­ment.

And in dis­cus­sion it’s cus­tom­ary to ex­plain why your view hasn’t changed. If my logic isn’t in­cor­rect, it is ob­vi­ously cor­rect, and it would be nice of you to ex­plain why you think it isn’t, in­stead of just offhand­edly dis­miss­ing me with­out ex­pla­na­tion.

Not once in my life have I had these de­bates (no, not ex­ag­ger­at­ing) and I find it a strange as­sump­tion that I have. Don’t spend an im­mense amount of time on these sort of fo­rums ya’ see.

If this sort of de­bate is truly so scripted could you point me to one? Since I’d gain an equal amount of in­for­ma­tion, ap­par­ently.

I do ac­tu­ally want to know what the ap­par­ently so com­mon chris­tian re­ply to these ar­gu­ments is, it’s sort of why I asked. I’m here to get in­for­ma­tion, not to be told that the in­for­ma­tion has already been given. This fact doesn’t re­ally help me.

I do actually want to know what the apparently so common christian reply to these arguments is

Find a smart Christian and talk to her.

You could also think about what is evidence and what are ideas in your mind about what God (according to your convenient definition of him) must do or cannot do. There’s a big difference. You might consider meme propagation and ruminate on why certain written-down “random hypotheses” become religions and take over the world, while others don’t. Oh, and speculations about the probabilities of things happening in universes with gods and universes without gods are neither facts nor arguments.

This isn’t evidence. There’s an equal probability of people writing these things in universes where there is no God and universes where there is a God. This is of course an estimation; we haven’t seen what these texts look like in a universe with a god compared to religious texts in a universe without a god, nor the number of them, so the texts we have don’t actually indicate anything about the existence of god.

Well, I’d expect more texts in a universe with a God. Where on earth are you getting this “equal probability”?

“Pascal’s” Mugging requires me to believe that the apparent universe we occupy, with its very low information content, is in fact merely part of a much larger program (in a causally linked and so incompressible way) which admits calculation within it of a specially designed (high-information-content) universe with 3^^^^3 people (and not, say, as a side effect of a low-information simulation that also computes other possibilities, like giving immense life and joy to comparable numbers of people). The odds of that, if we use the speed prior, would seem to be 2^-(bits describing our universe + number of instructions to compute it) : 2^-(bits describing that vastly larger universe + number of instructions to compute it). That’s going to be a minimum of 2^(O(3^^^^3)) : 1 in favor of the smaller universe, so under the speed prior this particular kind of probability falls off hugely faster than the utility grows.

However, I have little doubt that some creative philosopher can find some way to rescue the mugging argument in slightly different form.
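The speed-prior arithmetic in the comment above can be sketched numerically. This is a toy illustration, not a real speed-prior computation: the function below is a crude stand-in for a weight of the form 2^-(description bits + runtime steps), and the runtime N is a small stand-in, since a hypothesis that actually computes 3^^^^3 people cannot be represented at all.

```python
import math

def log2_speed_prior(description_bits: int, runtime_steps: int) -> int:
    # Log2 of an (unnormalized) speed-prior-style weight 2^-(bits + steps).
    # A crude stand-in for the idea above; real speed priors differ in detail.
    return -(description_bits + runtime_steps)

# A hypothesis that must actually compute 3^^^^3 simulated people needs on
# the order of 3^^^^3 runtime steps, so its weight carries a factor of
# roughly 2^-(3^^^^3), while the utility at stake contributes only
# log2(3^^^^3) bits.  We illustrate with a small stand-in runtime N:
N = 10**6
log_prior = log2_speed_prior(description_bits=1000, runtime_steps=N)
log_utility = math.log2(N)   # log2 of N lives' worth of utility

# log2(prior * utility) = log_prior + log_utility: hopelessly negative,
# so the expected utility of the mugger's scenario is driven toward zero.
print(log_prior + log_utility)
```

The runtime term dominates: the utility adds only logarithmically many bits, while the runtime penalty grows linearly in the number of steps.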

Looks like strategic thinking to me. If you organize yourself to be prone to being Pascal-mugged, you will get Pascal-mugged, and thus it is irrational to organize yourself to be Pascal-muggable.

edit:
It is as rational to introduce certain bounds on applications of your own reasoning as it is to try to build reliable, non-crashing software, or to impose simple rule-of-thumb limits on the output of the software that controls the positioning of control rods in a nuclear reactor.

If you properly consider a tiny probability of a mistake in your reasoning, a mistake that may lead to consideration of a number generated by a random string (a lot of such numbers are extremely huge), and apply some meta-cognition with regard to the appearance of such numbers, you’ll find that extremely huge numbers are also disproportionately represented in the products of errors in reasoning.

With regard to the wager, here is my answer:
If you see someone bend over backwards to make a nickel, it is probably not Warren Buffett you’re seeing. Indeed, the probability that a person who’s bending over backwards to make a nickel has $N would fall off sharply as N increases. Here you see a being that is mugging you, and he allegedly has the power to simulate 3^^^^3 beings that he can mug, have sexual relations with, torture, whatever. The larger the claim, the less probable it is that the situation is honest.

It is however exceedingly difficult to formalize such an answer or to arrive at it in a formal fashion. And for me, there could exist other wagers that are beyond my capability to reason correctly about.

For this reason, as a matter of policy I assume that I can make an error at each inference step, an error that can result in consideration of an extremely huge number, and I keep an upper cutoff on the numbers I’d use in considerations as an optimization strategy; if there is a huge number of this sort, more verification steps are needed. In particular, this has a very high impact on morality for me. Any sort of situation where you are killing fewer people to save more people: those situations are extremely uncommon and difficult to conjecture, but the appearance of such a situation can easily result from faulty reasoning.
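The error-per-inference-step policy above can be made concrete with a toy calculation. Assuming (hypothetically) an independent error probability per step, the chance that a long chain of reasoning is entirely error-free decays geometrically, which is why conclusions that lean on long derivations, especially ones producing astronomically huge numbers, deserve extra verification:

```python
def p_chain_error_free(eps_per_step: float, steps: int) -> float:
    # Probability an n-step inference chain contains no error, under the
    # (hypothetical) assumption of an independent error prob eps per step.
    return (1.0 - eps_per_step) ** steps

# Even a modest per-step error rate means long derivations are quite
# likely to contain at least one mistake - and mistakes, unlike honest
# evidence, can easily manufacture extremely huge numbers.
p_ok = p_chain_error_free(eps_per_step=0.01, steps=50)
p_error_somewhere = 1.0 - p_ok
print(round(p_error_somewhere, 3))
```

With a 1% per-step error rate, a 50-step argument has roughly a 40% chance of containing at least one error, which can easily dominate the tiny probability the mugger's scenario would otherwise be assigned.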

Maybe I’m miss­ing the point here, but why do we care about any num­ber of simu­lated “peo­ple” ex­ist­ing out­side the ma­trix at all? Even as­sum­ing that such peo­ple ex­ist, they’ll never effect me, nor effect any­one in the world I’m in. I’ll never speak to them, they’ll never speak to any­one I know and I’ll never have to deal with any con­se­quences for their deaths. There’s no ex­pec­ta­tion that I’ll be pun­ished or shunned for not car­ing about peo­ple from out­side the ma­trix, nor is there any way that these peo­ple could ever break into our world and at­tempt to pun­ish me for kil­ling them. As far as I care, they’re not real peo­ple and their deaths or non-deaths do not fac­tor into my util­ity func­tion at all. Un­less Pas­cal’s mug­ger claims he can use his pow­ers from out­side the ma­trix to cre­ate 3^^^3 peo­ple in our world (the only one I care about) and then kill them here, my judge­ment is based soley on that fact that U(me loos­ing 5$) < 0.

So, let’s as­sume that we’re ask­ing about the more in­ter­est­ing case and say that Pas­cal’s mug­ger is in­stead threat­en­ing to use his magic ex­tra-ma­trix pow­ers to cre­ate 3^^^3 peo­ple here on Earth one by one and that they’ll each go on in­ter­na­tional tele­vi­sion and de­nounce me for be­ing a ter­rible per­son and ask me over and over why I didn’t save them and then ex­plode into chunks of gore where ev­ery­one can see it be­fore fad­ing back out of the ma­trix (to avoid black hole con­cerns) and that all of this can be avoided with a sin­gle one time pay­ment of 5$. What then?

I hon­estly don’t know. Even one per­son be­ing cre­ated and kil­led that way definitely feels worse than imag­in­ing any num­ber of peo­ple out­side the ma­trix get­ting kil­led. I’d be tempted on an emo­tional level to say yes and give him the money, de­spite my more in­tel­lec­tual parts say­ing this is clearly a setup and that some­thing that ter­rible isn’t ac­tu­ally go­ing to hap­pen. 3^^^3 peo­ple, while I ob­vi­ously can’t re­ally imag­ine that many, is only worse since it will keep hap­pen­ing over and over and over un­til af­ter the stars burn out of the sky.

The only re­ally con­vinc­ing ar­gu­ment, aside from the ar­gu­ment from ab­sur­dity (“That’s stupid, he’s just a philoso­pher out try­ing to make a quick buck.”) is Polymeron’s ar­gu­ment here

Replace “matrix” with “light cone” and see if you would still endorse that.

they’ll each go on international television and denounce me for being a terrible person and ask me over and over

There’s not enough time.

I’d be tempted on an emotional level to say yes and give him the money

If I ever ascend into deityhood I’ll be sorely tempted to go around offering Pascal’s wager in various forms and inverting the consequences from what I said, or having no other consequences no matter what they do.

“Accept Christianity, and have a chance of heaven.” (Create heaven only for those who decline my Wager.)

“Give me five dollars, or I will torture some people.” (Only torture people if they give me five dollars.)

Check the multiverse to see how many beings will threaten people with Pascal’s wager. Create 3^^(a large number of up arrows)^^^3 unique beings for each philosopher con man. Ask each new being: “Give me five dollars, or I will torture some people.” (Do nothing. Let them live out normal lives with the benefit of their money. [Don’t worry, for each such reply I will add five dollars’ worth of goods and services to their world, to avoid deflation and related issues.])

Why everyone is assuming the probability that they are confronting a trickster testing them is zero, or that it is in any case a smaller probability than something different that they can’t get a handle on because it is too small, I have no idea.

Since people are so taken with taking beings only at their word, wouldn’t a being telling them it will trick them if it gets the power confound them?

Changing “matrix” to “light cone” changes little, since I still don’t expect to ever interact with them. The light cone example is only different insofar as I expect more people in my light cone to (irrationally) care about people beyond it. That might cause me to make some token efforts to hide or excuse my apathy towards the 3^^^3 lives lost, but not to the same degree as even one life lost here inside my light cone.

If you accept that someone making a threat in the form of “I will do X unless you do Y” is evidence for “they will do X unless you do ~Y”, then by the principle of conservation of evidence, you have evidence that everyone who ISN’T making a threat in the form of “I will do X unless you do Y” will do X unless you do Y, for all values of X and Y for which you accept this trickster hypothesis. And that is absurd.

I think the problem might lie in the almost laughable disparity between the price and the possible risk. A human mind is not capable of instinctively providing a reason why it would be worth killing 3^^^^3 people, or even, I think, a million people, as punishment for not getting $5. A mind that would value $5 as much as or more than the lives of 3^^^^3 people is utterly alien to us, and so we leap to the much more likely assumption that the guy is crazy.

Is this a bias? I’d call it a heuristic. It calls to my mind the discussion in Neal Stephenson’s Anathem about pink nerve-gas-farting dragons. (Mandatory warning: fictional example.) The crux of it is, our minds only bother to anticipate situations that we can conceive of as logical. Therefore, the manifest illogicality of the mugging (why is 3^^^^3 lives worth $5; if you’re a Matrix Lord why can’t you just generate $5 or, better yet, modify my mind so that I’m inclined to give you $5, etc.) causes us to anti-anticipate its truth. Otherwise, what’s to stop you from imagining, as stated by Tom_McCabe2 (and mitchell_porter2, &c.), that typing the string “QWERTYUIOP” leads to, for example, 3^^^^3 deaths? If you imagine it, and conceive of it as a logically possible outcome, then regardless of its improbability, by your argument (as I see it), a “mind that worked strictly by Solomonoff induction” should cease to type that string of letters ever again. By induction, such a mind could cause itself to cease to take any action, which would lead to... well, if the AI had access to itself, likely self-deletion.

That’s my top-of-the-head theory. It doesn’t really answer the question at hand, but maybe I’m on the right track...?

It seems like this may be another facet of the problem with our models of expected utility in dealing with very large numbers. For instance, do you accept the Repugnant Conclusion?

I’m at a loss for how to model expected utility in a way that doesn’t generate the Repugnant Conclusion, but my suspicion is that if someone finds it, this problem may go away as well.

Or not. It seems that our various heuristics and biases against having correct intuitions about very large and small numbers are directly tied up in producing a limiting framework that acts as a conservative check.

One thought: the expected utility of letting our god-like figure run this Turing simulation might well be positive! S/​He is essentially creating these 3^^^3 people and then killing them. And in fact, it’s reasonable to assume that the expected disutility of killing them is entirely dependent on (and thus exactly balanced by) the utility of their creation.

So, our mugger doesn’t really hand us a dilemma unless the claim is that this simulation is already running, and those people have lives worth living, but if you don’t pay the $5, the program will be altered (the sun will stop in the sky, so to speak) and they will all be killed. This last is more of a nitpick.

It does seem to me that the bayesian inference we draw from this person’s statement must be extraordinarily low, with an uncertainty much larger than its absolute value. Because a being which is both capable of this and willing to offer such a wager (either in truth or as a test) is deeply beyond our moral or intellectual comprehension. Indeed, if the claim is true, that fact will have utility implications that completely dwarf the immediate decision. If they are willing to do this much over 5 dollars, what will they do for a billion? Or for some end that money cannot normally purchase? Or merely at whim? It seems that the information we receive by failing to pay may be of value commensurate with the disutility of them truthfully carrying out their threat.

Read http://www.spaceandgames.com/?p=22 if you haven’t already. Your utility function should not be assigning things arbitrarily large additive utilities, or else you get precisely this problem (if pigs qualify as minds, use rocks), and your function will sum to infinity. If you “kill” by destroying the exact same information content over and over, it doesn’t seem to be as bad, or even bad at all. If I made a million identical copies of you, froze them into complete stasis, and then shot 999,999 with a cryonics-proof Super-Plasma-Vaporizer, would this be immoral? It would certainly be less immoral than killing a million ordinary individuals, at least as far as I see it.

Michael, your pig example threw me into a great fit of belly-laughing. I guess that’s what my mind looks like when it explodes. And I recall that was Marvin Minsky’s prediction in The Society of Mind.

Don’t dollars have an infinite expected value (in human lives or utility) anyway, especially if you take into account weird low-probability scenarios? Maybe the next mugger will make even bigger threats.

Pascal’s wager type arguments fail due to their symmetry (which is preserved in finite cases).

Even if our priors are symmetric for equally complex religious hypotheses, our posteriors almost certainly won’t be. There’s too much evidence in the world, and too many strong claims about these matters, for me to imagine that posteriors would come out even. Besides, even if two religions are equally probable, there may certainly be non-epistemic reasons to prefer one over the other.

However, if after chugging through the math it didn’t balance out, and still the expected disutility from the existence of the disutility threat was greater, then perhaps allowing oneself to be vulnerable to such threats is genuinely the correct outcome, however counterintuitive and absurd it would seem to us.

I agree. If we really trust the AI doing the computations and don’t have reason to think that it’s biased, and if the AI has considered all of the points that have been raised about the future consequences of showing oneself vulnerable to Pascalian muggings, then I feel we should go along with the AI’s conclusion. 3^^^^3 people is too many to get wrong, and if the probabilities come out asymmetric, so be it.

Maybe the origin of the paradox is that we are extending the principle of maximizing expected return beyond its domain of applicability.

In addition to a frequency argument, one can in some cases make a different argument for maximizing expected value even in one-time-only scenarios. For instance, if you knew you would become a randomly selected person in the universe, and if your only goal was to avoid being murdered, then minimizing the expected number of people murdered would also minimize the probability that you personally would be murdered. Unfortunately, arguments like this make the assumption that your utility function on outcomes takes only one of two values (“good,” i.e., not murdered, and “bad,” i.e., murdered); it doesn’t capture the fact that being murdered in one way may be twice as bad as being murdered in another way.

Eliezer> Is the value of my existence steadily shrinking as the universe expands and it requires more information to locate me in space?

Yes, but the value of everyone else’s existence is shrinking by the same factor, so it doesn’t disturb the preference ordering among possible courses of action, as far as I can see.

Eliezer> If I make a large uniquely structured arrow pointing at myself from orbit so that a very simple Turing machine can scan the universe and locate me, does the value of my existence go up?

This is a more serious problem for my proposal, but the conspicuous arrow also increases the values of everyone near you by almost the same factor, so again perhaps it doesn’t make as much difference as you expect.

Eliezer> I am skeptical that this solution makes moral sense, however convenient it might be as a patch to this particular problem.

I’m also skeptical, but I’d say it’s more than just a patch to this particular problem. Treating everyone as equals no matter what their measures are, besides leading to counterintuitive results in this “Pascal’s Mugging” thought experiment, is not even mathematically sound, since the sums of the small probabilities multiplied by the vast utilities do not converge to any finite value, no matter what course of action you choose.

The mathematics says that you have to discount each person’s value by some function, otherwise your expected utilities won’t converge. The only question is which function. Using the inverse of a person’s algorithmic complexity seems to lead to intuitive results in many situations, but not all.

But I’m also open to the possibility that this entire approach is wrong… Are there other proposed solutions that make more sense to you at the moment?

First, questions like “if the agent expects that I wouldn’t be able to verify the extreme disutility, would its utility function be such as to actually go through spending the resources to cause the unverifiable disutility?”

That an entity with such a utility function would even manage to stick around long enough in the first place may itself drop the probabilities by a whole lot.

Perhaps best to restrict ourselves to the case of the disutility being verifiable, but only after the fact (has this agent ever pulled this sort of thing before? etc.), and where that verification doesn’t open a causal link in the present allowing for other means of preventing the disutility. There’s a lot going on here.

I’m not sure, but maybe the reasoning would go not so much for the single specific case; rather, the process would reason by computing the expected utility of following a rule which would result in it being utterly vulnerable to any agent that merely claims to be capable of causing bignum units of disutility.

Something reasoning along the lines of following such a rule would allow agents in general to order the process to cause plenty of disutility. And that, in itself, would seem to have plenty of expected disutility.

However, if after chugging through the math it didn’t balance out, and still the expected disutility from the existence of the disutility threat was greater, then perhaps allowing oneself to be vulnerable to such threats is genuinely the correct outcome, however counterintuitive and absurd it would seem to us.

I think that if you consider that the chance of a threat to cause a given amount of disutility being valid is a function of the amount of disutility, then the problem mostly goes away. That is, in my experience any threat to cause me X units of disutility where X is beyond some threshold is less than 1⁄10 as credible as a threat to cause me 1 unit of disutility. If someone threatened to kill another person unless I gave them $5000, I would be worried. If they threatened to kill 10 people I would be very slightly less worried. If they threatened to kill 1000 people I would be roughly 10 times less worried. If they threatened to kill 1,000,000 people I wouldn’t pay any attention at all. Taking these data points and extrapolating, I form the heuristic that the credibility of a threat of X units of disutility, beyond a threshold based on how much is being demanded and whether I can fulfill that demand, decreases faster than linearly in X.
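The heuristic above can be sketched in a few lines. This is a toy model under stated assumptions: credibility falling off quadratically (one arbitrary choice of "faster than linearly") with an arbitrary illustrative constant c, so expected disutility actually decreases as the threat grows:

```python
def threat_credibility(x: float, c: float = 1e-4) -> float:
    # Toy model: above some scale, the credibility of a threat of x units
    # of disutility falls off quadratically, i.e. faster than linearly.
    # The constant c is an arbitrary illustrative choice.
    return min(1.0, c / (x * x))

def expected_disutility(x: float) -> float:
    # Expected harm = threatened harm times credibility of the threat.
    return x * threat_credibility(x)

# Because credibility shrinks faster than the threatened harm grows,
# bigger threats carry LESS expected disutility, matching the heuristic
# described in the comment above.
for x in (10.0, 1_000.0, 1_000_000.0):
    print(x, expected_disutility(x))
```

Any credibility function decaying faster than 1/x gives the same qualitative result; quadratic decay is just the simplest example.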

Nothing could possibly be that weak.

On the contrary, I think it is not only that weak but actually far weaker. If you are willing to consider the existence of things like 3^^^3 units of disutility without considering the existence of chances like 1/​4^^^4, then I believe that is the problem that is causing you so much trouble.

You don’t need a bounded utility function to avoid this problem. It merely has to have the property that the utility of a given configuration of the world doesn’t grow faster than the length of a minimal description of that configuration. (Where “minimal” is relative to whatever sort of bounded rationality you’re using.)

It actually seems quite plausible to me that our intuitive utility-assignments satisfy something like this constraint (e.g., killing 3^^^^^3 puppies doesn’t feel much worse than killing 3^^^^3 puppies), though that might not matter much if you think (as I do, and I expect Eliezer does) that our intuitive utility-assignments often need a lot of adjustment before they become things a really rational being could sign up to.
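The description-length constraint above can be illustrated with a toy proxy. Real Kolmogorov complexity is uncomputable, so this sketch stands in zlib-compressed size for "minimal description length"; the point is only that adding an up-arrow barely lengthens the description, so a description-length-bounded utility barely moves:

```python
import zlib

def description_bits(s: str) -> int:
    # Crude stand-in for minimal description length: compressed size.
    # Real Kolmogorov complexity is uncomputable; this is illustrative only.
    return 8 * len(zlib.compress(s.encode("utf-8")))

def bounded_utility(outcome: str) -> int:
    # Toy utility whose magnitude is capped by the outcome's description
    # length, per the constraint suggested in the comment above.
    return description_bits(outcome)

# One more up-arrow barely lengthens the description, so the bounded
# utility barely changes - killing 3^^^^^3 puppies doesn't weigh much
# more than killing 3^^^^3 puppies.
u1 = bounded_utility("3^^^^3 puppies killed")
u2 = bounded_utility("3^^^^^3 puppies killed")
print(u1, u2)
```

The two utilities differ by at most a few bits, even though the numbers they denote differ by an incomprehensible factor.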

It’s definitely clever, but it’s not quite what a Gödel sentence for us would be: it would seem to us to be an intractable statement about something else, and we’d be incapable of comprehending it as an indirect reference to our processes of understanding.

So, in particular, a human being can’t write the Gödel sentence for humans.

GeniusNZ, you have to consider not only all proposed gods, but all possible gods and reward/​punishment structures. Since the number and range of conceivable divine rewards and punishments is infinite for each action, the incentives are all equally balanced, and thus give you no reason to prefer one action over another.

Ultimately, I think Tom McCabe is right: the truth of a proposition depends in part on its meaningfulness.

What is the probability that the sun will rise tomorrow? Nearly 1, if you’re thinking of dawns. Nearly 0, if you’re thinking of Copernicus. Bayesian reasoning can evaluate propositions, but at the limit, one must already have a rational vocabulary in which to express hypotheses.

When someone threatens to kill 3^^^^3 people, this calls into question:

1) whether the assertion is meaningful at all
2) whether the lives in question are equivalent to “human lives” already observed, or are unlike in kind; in other words, whether they should be valued similarly.

After all, analogously to the original Wager’s problem, these 3^^^^3 people could be of negative moral value; it could be good to kill them. And no, Pascal’s Mugger cannot just respond that he means people like you and me, because they are obviously not exactly analogous, since they are unobservable.

You could ar­gue that do­ing any ac­tion, such as ac­cept­ing the wa­ger, has a small but much larger than 1/​3^^^3 chance of kil­ling 3^^^3 peo­ple. You could ar­gue that any ac­tion has a small but much larger than 1/​3^^^3 chance of guaran­tee­ing bliss­ful im­mor­tal­ity for 3^^^3 peo­ple. There­fore, de­clin­ing the wa­ger makes a lot more sense be­cause no mat­ter what you do you might have already doomed all those peo­ple.

I think you are over­es­ti­mat­ing the prob­a­bil­ities there: it is only Pas­cal’s Mug­ging if you fail to at­tribute a low enough prob­a­bil­ity to the mug­ger’s claim. The prob­lem, in my opinion, is not how to deal with tiny prob­a­bil­ities of vast util­ities, but how not to at­tribute too high prob­a­bil­ities to events whose prob­a­bil­ities defy our brain’s ca­pac­ity (like “magic pow­ers from out­side the Ma­trix”).

I also feel that, as with Pascal’s wager, this situation can be mirrored (and the expected utilities thereby canceled out) if you simply think, “What if he intends to kill those people only if I abide by his demand?” As with Pascal’s wager, the possibilities aren’t only what the wager stipulates: when dealing with infinities in decision making (I’m not sure one can say “the probability of this event doesn’t overcome the vast utility gained” with such numbers), you probably have another infinity, which you also can’t evaluate, hiding behind the question.

I think you are over­es­ti­mat­ing the prob­a­bil­ities there: it is only Pas­cal’s Mug­ging if you fail to at­tribute a low enough prob­a­bil­ity to the mug­ger’s claim. The prob­lem, in my opinion, is not how to deal with tiny prob­a­bil­ities of vast util­ities, but how not to at­tribute too high prob­a­bil­ities to events whose prob­a­bil­ities defy our brain’s ca­pac­ity (like “magic pow­ers from out­side the Ma­trix”).

The problem here is that you’re not “attributing” a probability; you’re calculating a probability through Solomonoff Induction. In this case, the probability is far too low to actually calculate, but simple observation tells us this much: the Solomonoff probability is given by the expression 2^(-Kolmogorov complexity), which is mere exponentiation. There’s pretty much no way mere exponentiation can catch up to four up-arrows in Knuth’s up-arrow notation; it doesn’t even really matter what the Kolmogorov complexity is, because 2^(-K) can’t possibly be as small as 3^^^^3 is large.
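
To make that growth mismatch concrete, here is a small Python sketch (purely illustrative: the 1,000-bit complexity penalty is an invented figure, and 2^^5 is used only because 3^^^^3 itself cannot be computed):

```python
from fractions import Fraction

def up_arrow(a, n, b):
    # Knuth's up-arrow notation: a with one arrow is a**b; higher n recurses
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

assert up_arrow(3, 2, 3) == 7625597484987  # 3^^3 = 3^27

# Even a 1,000-bit penalty (prior of 2^-1000) is nothing next to a payoff
# of merely 2^^5 = 2**65536 -- let alone 3^^^^3:
prior = Fraction(1, 2) ** 1000
payoff = up_arrow(2, 2, 5)
assert prior * payoff > 10 ** 19000
```

The expected value here, 2^(65536 − 1000), still has over 19,000 decimal digits: exponentiation in the prior simply cannot keep pace with iterated exponentiation in the payoff.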

All would be well and good if we could sim­ply as­sign prob­a­bil­ities to be what­ever we want; then we could just set the prob­a­bil­ity of Pas­cal’s-Mug­ging-type situ­a­tions as low as we wanted. To an ex­tent, since we’re hu­mans and thus un­able to com­pute the ac­tual prob­a­bil­ities, we still can do this. But para­dox­i­cally enough, as a mind’s com­pu­ta­tional abil­ity in­creases, so too does its sus­cep­ti­bil­ity to these types of situ­a­tions. An AI that is ac­tu­ally able to com­pute/​ap­prox­i­mate Solomonoff In­duc­tion would find that the prob­a­bil­ity is vastly out­weighed by the util­ity gain, which is part of what makes the prob­lem a prob­lem.

I also feel that, as with Pascal’s wager, this situation can be mirrored (and the expected utilities thereby canceled out) if you simply think, “What if he intends to kill those people only if I abide by his demand?” As with Pascal’s wager, the possibilities aren’t only what the wager stipulates: when dealing with infinities in decision making (I’m not sure one can say “the probability of this event doesn’t overcome the vast utility gained” with such numbers), you probably have another infinity, which you also can’t evaluate, hiding behind the question.

But do the two possibilities really sum to zero? These are two different situations we’re talking about here: “he kills them if I don’t abide” versus “he kills them if I do”. If a computationally powerful enough AI calculated the probabilities of these two possibilities, would they actually miraculously cancel out? The probabilities will likely mostly cancel, true, but even the smallest remainder will still be enough to trigger the monstrous utilities carried by a number like 3^^^^3. If an AI actually carries out the calculations, without any a priori desire that the probabilities should cancel, can you guarantee that they will? If not, then the problem persists.

Also, your re­mark on in­fini­ties in de­ci­sion-mak­ing is well-taken, but I don’t think it ap­plies here. As large as 3^^^^3 is, it’s nowhere close to in­finity. As such, the sort of prob­lems that in­finite util­ities pose, while in­ter­est­ing in their own right, aren’t re­ally rele­vant here.

You’re only “calcu­lat­ing a prob­a­bil­ity through Solomonoff In­duc­tion” if the prob­a­bil­ity is only af­fected by com­plex­ity. If there are other rea­sons that could re­duce the prob­a­bil­ity, they can re­duce it by more. For in­stance, a ly­ing mug­ger can in­crease his prob­a­bil­ity of be­ing able to ex­tort money from a naive ra­tio­nal­ist by in­creas­ing the size of the pur­ported pay­off, so a large pay­off is bet­ter ev­i­dence for a ly­ing mug­ger than a small pay­off.

Ad­di­tional fac­tors very well may re­duce the prob­a­bil­ity. The ques­tion is whether they re­duce it by enough. Given how enor­mously large 3^^^^3 is, I’m prac­ti­cally cer­tain they won’t. And even if you some­how man­age to come up with a way to re­duce the prob­a­bil­ity by enough, there’s noth­ing stop­ping the mug­ger from sim­ply adding an­other up-ar­row to his claim: “Give me five dol­lars, or I’ll tor­ture and kill 3^^^^^3 peo­ple!” Then your prob­a­bil­ity re­duc­tion will be ren­dered pretty much ir­rele­vant. And then, if you mirac­u­lously find a way to re­duce the prob­a­bil­ity again to ac­count for the enor­mous in­crease in util­ity, the mug­ger will sim­ply add yet an­other up-ar­row. So we see that ad hoc prob­a­bil­ity re­duc­tions don’t work well here, be­cause the mug­ger can always over­come those by mak­ing his num­ber big­ger; what’s needed is a prob­a­bil­ity penalty that scales with the size of the mug­ger’s claim: a penalty that can always re­duce the ex­pected util­ity of his offer down to ~0. Fac­tors in­de­pen­dent of the size of his claim, such as the prob­a­bil­ity that he’s ly­ing (since he could be ly­ing no mat­ter how big or how small his num­ber ac­tu­ally is), are un­likely to ac­com­plish this.
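
A toy sketch of what such a scaling penalty would do (the 1/n prior here is an assumption made for illustration, not a worked-out proposal):

```python
from fractions import Fraction

def scaling_penalty(n):
    # hypothetical prior that shrinks in direct proportion to the lives claimed
    return Fraction(1, n)

# However many up-arrows the mugger adds, the expected utility stays bounded:
for claimed_lives in (10 ** 6, 3 ** 27, 3 ** 100):
    assert claimed_lives * scaling_penalty(claimed_lives) == 1
```

A fixed probability of lying, by contrast, only rescales the product by a constant, so it cannot stop the expected utility from growing with the claim.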

such as the prob­a­bil­ity that he’s ly­ing (since he could be ly­ing no mat­ter how big or how small his num­ber ac­tu­ally is)

He could be ly­ing re­gard­less of the size of the num­ber, but the prob­a­bil­ity that he is ly­ing would still be af­fected by the size of the num­ber. A larger num­ber is more likely to con­vince a naive ra­tio­nal­ist than a smaller num­ber, pre­cisely be­cause be­liev­ing the larger num­ber means be­liev­ing there is more util­ity. This makes larger num­bers more benefi­cial to fake mug­gers than smaller num­bers. So the larger the num­ber, the lower the chance that the mug­ger is tel­ling the truth. This means that chang­ing the size of the num­ber can de­crease the prob­a­bil­ity of truth in a way that keeps pace with the in­crease in util­ity that be­ing true would provide.

(Actually, there’s an even more interesting factor that nobody ever brings up: even genuine muggers must have a distribution of numbers they are willing to use. This distribution must have a peak at a finite value, since it is impossible to have a uniform distribution over all the natural numbers. If the fake mugger keeps adding arrows, he’s going to go past this peak, and a rationalist’s estimate that he is telling the truth should go down because of that as well.)

Is this simply one statement? Is Solomonoff complexity additive over multiple statements that must all be true at once?
Or is it possible that we can calculate the probability as a chain of Solomonoff complexities, something like:

s1, s2, etc. are the statements. You need all of them to be true: magic powers, Matrix, etc. Are they simply considered as one statement with one Solomonoff complexity, giving probability 2^(-x)? Or do the probabilities multiply, 2^(-(x1 + x2 + …))? Or is it K1^K2^… = 2^(2^(2^...))?

And if it’s considered as one statement, does simply calculating the probability with K1^K2^… solve the problem?
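
For what it’s worth, the standard answer is the middle option: the description length of a conjunction is at most about the sum of the individual description lengths, so the probabilities multiply, giving 2^(-(x1 + x2 + …)) rather than any exponential tower. As a rough illustration, compressed size can stand in for Kolmogorov complexity (a crude, computable upper-bound proxy; the example strings and the 32-byte slack are arbitrary):

```python
import zlib

def approx_K(s: bytes) -> int:
    # compressed size: a crude upper-bound proxy for Kolmogorov complexity
    return len(zlib.compress(s, 9))

s1 = b"our world is a simulation run on an outside computer"
s2 = b"the mugger is spliced into our Turing tape"
joint = s1 + b" and " + s2

# Describing both statements costs at most about the sum of the parts:
assert approx_K(joint) <= approx_K(s1) + approx_K(s2) + 32
```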

Point taken on the sum­ma­tion of the pos­si­bil­ities, they might not sum to zero.

Also, does invoking “magic powers” equal invoking an infinity? It basically says nothing except “I can do what I want”.

For the most part, when person P says, “I will do X,” that is evidence that P will do X, and the probability of P doing X increases. By contrast, if P has a reputation for sarcasm and says the same thing, then the probability that P will do X decreases. Clearly, then, our estimation of P’s position in mindspace determines whether we increase or decrease the likelihood of P’s claims. For the mugging situation, we might adopt a model where the mugger’s claims about very improbable actions in no way affect what we expect him to do, since we do not have a useful estimate of the mugger’s position in mindspace (how could we?). We cannot assume the mugger tends to be more honest than not, as we can with humans. Expected utilities balance and cancel, so I should keep my wallet.

You’re right about this. How­ever the main prob­lem we have here is this:

A com­pactly speci­fied wa­ger can grow in size much faster than it grows in com­plex­ity. The util­ity of a Tur­ing ma­chine can grow much faster than its prior prob­a­bil­ity shrinks.

If the ex­pected util­ity grows pro­por­tional to −2^(2^n) but the prior prob­a­bil­ity de­creases pro­por­tional to 2^(-n) in the com­plex­ity n (mea­sured in Kol­mogorov Com­plex­ity for all I care), then even if the in­for­ma­tion we get from their ut­ter­ance does lead us to have a pos­te­rior differ­ent from the prior, the util­ity goes to -∞ for n go­ing to ∞.
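
A quick numeric check of that divergence (a toy calculation using exactly the growth rates quoted above):

```python
from fractions import Fraction

def expected_term(n):
    # utility proportional to -2^(2^n), prior proportional to 2^(-n)
    return -(2 ** (2 ** n)) * Fraction(1, 2 ** n)

# The product heads toward -infinity as the complexity n grows:
assert expected_term(3) < expected_term(2) < expected_term(1)
```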

Fair enough, even though I wouldn’t call that my prior but rather my pos­te­rior af­ter up­dat­ing on my be­lief of what their ex­pected util­ity might be.

So you pro­pose that I up­date my prob­a­bil­ity to be pro­por­tional to the in­verse of their ex­pected util­ity?
How do I even be­gin to guess their util­ity func­tion if this is a one-shot in­ter­ac­tion?
How do I dis­t­in­guish be­tween hon­est and dishon­est peo­ple?

Un­der ig­no­rance of the mug­ger’s po­si­tion in mindspace, we then should as­sign the same prob­a­bil­ity to the mug­ger’s claim and the claim’s op­po­site. Then for all n, (n utilons) Pr(mug­ger will cause n utilons) + (-n utilons) Pr(mug­ger will cause -n utilons) = 0. This re­sponse seems to man­age the rate differ­ence be­tween the util­ity and the prob­a­bil­ity.
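
Spelled out numerically (a toy model; the particular tiny probability p is arbitrary):

```python
from fractions import Fraction

# Under ignorance, give the claim and its opposite the same probability p:
p = Fraction(1, 10 ** 12)
n = 3 ** 27  # stand-in for some enormous utility

# (n utilons) * Pr(+n) + (-n utilons) * Pr(-n):
expected = n * p + (-n) * p
assert expected == 0  # exact cancellation, however large n is
```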

The question is not only about their position in mindspace. Surely there may be as many possible minds (not just humans) which believe they can simulate 3^^^3 people and torture them as there are minds that do not believe so. But this does not mean that there are as many possible minds which actually could really do it. So I shouldn’t use a maximum-entropy prior for my belief in their ability to do it, but for my belief in their belief in their ability to do it!

This is one of those cases where it helps to be a hu­man, be­cause we’re dumb enough that we can’t pos­si­bly calcu­late the true prob­a­bil­ities in­volved, and so the ex­pected util­ities sum to zero in any rea­son­able ap­prox­i­ma­tion of the situ­a­tion, by hu­man stan­dards.

Un­for­tu­nately, a su­per­in­tel­li­gent AI would be able to get a much bet­ter calcu­la­tion out of some­thing like this, and while a .0000000000000001 prob­a­bil­ity might round down to 0 for us lowly hu­mans, an AI wouldn’t round that down. (After all, why should it? Un­like us, it has no rea­son to doubt its ca­pa­bil­ities for calcu­la­tion.) And with enor­mous util­ities like 3^^^^3, even a .0000000000000001 differ­ence in prob­a­bil­ity is too much. The prob­lem isn’t with us, di­rectly, but with the be­hav­ior a hy­po­thet­i­cal AI agent might take. We cer­tainly don’t want our newly-built FAI to sud­denly de­cide to de­vote all of hu­man­ity’s re­sources to serv­ing the first per­son who comes up with the bright idea of Pas­cal’s Mug­ging it.

Why wait un­til some­one wants the money? Shouldn’t the AI try to send 5 Dol­lars to ev­ery­one with a note at­tached read­ing “Here is a trib­ute; please don’t kill a huge num­ber of peo­ple” re­gard­less of whether they ask for it or not?

My prob­lem with this sce­nario is that I’ve never run Solomonoff In­duc­tion, I run ev­i­den­tial­ism. Mean­ing: if a hy­poth­e­sis’s prob­a­bil­ity is equal to its True Prior, I just treat that as equiv­a­lent to “quan­tum foam”, some­thing that ex­ists in my math­e­mat­ics for ease of fu­ture calcu­la­tions but has no real tie to phys­i­cal re­al­ity, and is there­fore dis­missed as equiv­a­lent to prob­a­bil­ity 0.0.

Ba­si­cally, my brain can rea­son about plau­si­bil­ity in terms of pure pri­ors, but prob­a­bil­ity re­quires at least some tiny bit of ev­i­dence one way or the other. In fact, even a very plau­si­ble hy­poth­e­sis, in terms of be­ing so sim­ple that its Solomonoff Prior is, say, 0.75, would make my brain throw a type-er­ror if I tried to bet on it. My pri­ors don’t tell me any­thing about re­al­ity, they’re only a fea­ture of my mind, they just tell me the start­ing point for run­ning ev­i­den­tial up­dates that do cor­re­late with re­al­ity.

But even agents based on Solomonoff Induction, such as AIXI, are not subject to Pascal’s Mugging in reasonable environments. Consider this paper by Hutter. IIUC, the “self-optimizing” property essentially implies robustness against Pascal’s Mugging.

The problem is our intuition: the utility of human lives grows linearly with the population of humans, while the message size of the hypothesis needed to describe them grows roughly logarithmically. Since the prior probability seems to decline down its exponential slope more slowly than the utility increases, Pascal’s Bargain sounds favorable.

This is wrong. A real Solomonoff Hy­poth­e­sis, af­ter all, does not merely say “There are 3^^^3 hu­mans” if there re­ally are 3 ^^^ 3 hu­mans. It de­scribes and pre­dicts each sin­gle hu­man, af­ter all, in de­tail. It’s a hy­poth­e­sis that aims to pre­dict an en­tire uni­verse from one small piece of fairy cake.

And when you have to make pre­dic­tive de­scrip­tions of hu­mans, the summed size of those de­scrip­tions will grow at least lin­early in the num­ber of pur­ported hu­mans. God help you if they start or­ga­niz­ing them­selves into com­pli­cated so­cieties, which are more com­plex than a mere summed set of sin­gle per­sons. Now your util­ity from tak­ing the Bar­gain grows lin­early while your nega­tive ex­po­nent on the prob­a­bil­ity of its be­ing real de­clines lin­early.
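
A toy calculation of that trade-off (the per-person description cost of c = 1 bit is an arbitrary stand-in):

```python
from fractions import Fraction

def expected_gain(n, c=1):
    # n lives of utility, weighted by a prior of 2^(-c*n) when each
    # purported person costs c bits to describe
    return n * Fraction(1, 2) ** (c * n)

# Once the exponent scales with n, linear utility loses immediately:
assert expected_gain(1) > expected_gain(10) > expected_gain(100)
```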

The question then becomes a simple matter of where your plausibility-versus-utility tradeoff sits for one human life.

Or in other words, if Pascal’s Mugger catches you in a back alley, you should demand that he start describing the people he’s threatening one by one.

Or in other words, if Pascal’s Mugger catches you in a back alley, you should demand that he start describing the people he’s threatening one by one.

By this reasoning, if someone has their finger on the trigger of a nuclear bomb that will destroy a city of a million people, and says “give me $1 or I will pick a random number and destroy the city at a 1/1000000 chance”, you should refuse. After all, he cannot describe any of the people he is planning to kill, so this is equivalent to killing one person at a 1/1000000 chance, and you could probably exceed that chance of killing someone just by driving a couple of miles to the supermarket.

If it’s a mil­lion peo­ple pos­si­bly dy­ing at a one-in-a-mil­lion chance, then in ex­pected-death terms he’s charg­ing me $1 not to kill one per­son. Since I be­lieve hu­man lives are worth more than $1, I should give him the dol­lar and make a “profit”.
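
The arithmetic behind that “profit”:

```python
from fractions import Fraction

people = 10 ** 6
chance = Fraction(1, 10 ** 6)

# In expected-death terms, the $1 buys off exactly one statistical life:
expected_deaths_averted = people * chance
assert expected_deaths_averted == 1
```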

Of course, the other is­sue here, and the rea­son we don’t make analo­gies be­tween se­ri­ous mil­i­tary threats and Pas­cal’s Mug­ging, is that in the nu­clear bomb case, there is ac­tual ev­i­dence on which to up­date my be­liefs. For in­stance, the but­ton he’s got his finger on is ei­ther real, or not. If I can see damn well that it’s a plas­tic toy from the lo­cal store, I’ve no rea­son to give him a dol­lar.

So in the case of Soviet Rus­sia threat­en­ing you, you’ve got real ev­i­dence that they might de­liber­ately nuke your cities with a prob­a­bil­ity much higher than one in a mil­lion. In the case of Pas­cal’s Mug­ger, you’ve got a 1⁄1,000,000 chance that the mug­ger is tel­ling the truth at all, and all the other prob­a­bil­ity mass points at the mug­ger be­ing a delu­sion of your badly-coded rea­son­ing al­gorithms.

If it’s a mil­lion peo­ple pos­si­bly dy­ing at a one-in-a-mil­lion chance, and I use the rea­son­ing you used be­fore, be­cause the mug­ger can’t de­scribe the peo­ple he’s threat­en­ing to kill, I shouldn’t treat that as any worse than a threat to kill one per­son at a one-in-a-mil­lion chance.

You mis­con­strue my po­si­tion. I’m not say­ing, “De­scrip­tions are magic!”. I’m say­ing: I pre­fer ev­i­den­tial­ism to pure Bayesi­anism. Mean­ing: if the mug­ger can’t de­scribe any­thing about the city un­der threat, that is ev­i­dence that he is ly­ing.

Which misses the point of the sce­nario, since a real Pas­cal’s Mug­ging is not about a phys­i­cal mug­ger who could ever be ly­ing. It’s about hav­ing a flaw in your own rea­son­ing sys­tem.

But from what I can tell, it speci­fies that AIXI will even­tu­ally rea­son its way out of any finite Pas­cal’s Mug­ging. The higher the hy­poth­e­sized re­ward in the Mug­ging, the longer it will take to con­verge away from the Mug­ging, but each failure of re­al­ity to con­form to the Mug­ging will push down the prob­a­bil­ity of that en­vi­ron­ment be­ing true and thus re­duce the ex­pected value of act­ing ac­cord­ing to it. Asymp­totic con­ver­gence is proven.

I’d also bet that Hut­ter’s for­mal­ism might con­sider large re­wards gen­er­ated for lit­tle rea­son to be not merely com­plex be­cause of “lit­tle rea­son”, but ac­tu­ally to have greater Kol­mogorov Com­plex­ity just be­cause large re­wards are more com­plex than small ones. Pos­si­bly. So there would be a ques­tion of whether the re­ward of a Mug­ging grows faster than its prob­a­bil­ity shrinks, in the limit. Eliezer claims it does, but the ques­tion is whether our con­ve­nient no­ta­tions for very, very, very large num­bers ac­tu­ally im­ply some kind of sim­plic­ity for those num­bers or whether we’re hid­ing com­plex­ity in our brains at that point.

It seems down­right ob­vi­ous that “3” ought be con­sid­ered vastly more sim­ple than “3 ^^^ 3″. How large a Tur­ing Ma­chine does it take to write down the ful­lest ex­pan­sion of the re­cur­sive func­tion for that su­per-ex­po­nen­ti­a­tion? Do we have to ex­pand it out? I would think a com­pu­ta­tional the­ory of in­duc­tion ought make a dis­tinc­tion be­tween com­pu­ta­tions out­putting large num­bers and ac­tual large num­bers, af­ter all.

But from what I can tell, it speci­fies that AIXI will even­tu­ally rea­son its way out of any finite Pas­cal’s Mug­ging. The higher the hy­poth­e­sized re­ward in the Mug­ging, the longer it will take to con­verge away from the Mug­ging, but each failure of re­al­ity to con­form to the Mug­ging will push down the prob­a­bil­ity of that en­vi­ron­ment be­ing true and thus re­duce the ex­pected value of act­ing ac­cord­ing to it. Asymp­totic con­ver­gence is proven.

Well, the Pascal’s Mugging issue essentially boils down to whether the agent’s decision-making is dominated by the bias in its prior. Clearly, an agent that has seen little or no sensory input can’t have possibly learned anything, and is therefore dominated by its bias. What Hutter proved is that, for reasonable classes of environments, the agent eventually overcomes its bias. There is of course the interesting question of convergence speed, which is not addressed in that paper.

I’d also bet that Hut­ter’s for­mal­ism might con­sider large re­wards gen­er­ated for lit­tle rea­son to be not merely com­plex be­cause of “lit­tle rea­son”, but ac­tu­ally to have greater Kol­mogorov Com­plex­ity just be­cause large re­wards are more com­plex than small ones. Pos­si­bly. So there would be a ques­tion of whether the re­ward of a Mug­ging grows faster than its prob­a­bil­ity shrinks, in the limit. Eliezer claims it does, but the ques­tion is whether our con­ve­nient no­ta­tions for very, very, very large num­bers ac­tu­ally im­ply some kind of sim­plic­ity for those num­bers or whether we’re hid­ing com­plex­ity in our brains at that point.

Note that in Hutter’s formalism rewards are bounded between 0 and some r_max. That’s no accident, since if you allow unbounded rewards, the expectation can diverge. Yudkowsky seems to assume unbounded rewards; I think that if you tried to formalize his argument, you would end up attempting to compare infinities. If rewards are bounded, the bias introduced by the fact that the contributions to the expectations from the tails of the distributions don’t exactly cancel out over different actions is eventually washed away as more evidence accumulates.

It seems down­right ob­vi­ous that “3” ought be con­sid­ered vastly more sim­ple than “3 ^^^ 3″. How large a Tur­ing Ma­chine does it take to write down the ful­lest ex­pan­sion of the re­cur­sive func­tion for that su­per-ex­po­nen­ti­a­tion?

It’s not re­ally very large.
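
Indeed. A Python rendering of the recursion takes only a few lines (the test values are arbitrary; up(3, 4, 3) would denote 3^^^^3 but is far beyond computing):

```python
def up(a, n, b):
    # Knuth's super-exponentiation: one arrow is a**b, more arrows recurse
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up(a, n - 1, up(a, n, b - 1))

assert up(3, 2, 3) == 7625597484987  # 3^^3
assert up(2, 2, 4) == 65536          # 2^^4
```

So the machine that names the number is tiny; only the number itself is vast.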

The point is that there are computable functions that grow faster than any exponential. The Solomonoff prior over the natural numbers (or any set of computable numbers with an infimum and not a supremum) has infinite expectation because of the contribution of these functions. (If the set has neither an infimum nor a supremum, I think that the expectation may be finite, positively infinite, or negatively infinite depending on the choice of the universal Turing machine and the number encoding.)

I’ve been discussing this whole thing on Reddit, in parallel, and I think this is the point where I would just give up and say: revert to evidentialism when discussing unbounded potential rewards. Any hypothesis with a plausibility (i.e., my quantity of belief equals its prior, no evidence accumulated) rather than a probability (i.e., priors plus evidence) nulls out to zero and is not allowed to contribute to expected-utility calculations.

(Ac­tu­ally, what does Bayesian rea­son­ing look like if you sep­a­rate pri­ors from ev­i­dence and con­sider an empty set of ev­i­dence to con­tribute a mul­ti­plier of 0.0, thus ex­actly nul­ling out all the­o­ries that con­sist of no ev­i­dence but their pri­ors?)

It seems as though Pas­cal’s mug­ging may be vuln­er­a­ble to the same “pro­fes­sor god” prob­lem as Pas­cal’s wa­ger. With prob­a­bil­ities that low, the differ­ence be­tween P(3^^^^3 peo­ple be­ing tor­tured|you give the mug­ger $5) and P(3^^^^3 peo­ple be­ing tor­tured| you spend $5 on a sand­wich) may not even be calcu­la­ble. It’s also pos­si­ble that the guy is try­ing to de­prive the sand­wich maker of the money he would oth­er­wise spend on the Si­mu­lated Peo­ple Pro­tec­tion Fund.
If you’re go­ing to say that P(X is true|some­one says X is true)>P(X is true|~some­one says X is true) in all cases, then that should ap­ply to Pas­cal’s wa­ger as well; P(Any given untestable god is real|there are sev­eral churches de­voted to it)>P(Any given untestable god is real|it was only ever pro­posed hy­po­thet­i­cally, tongue-in-cheek) and thus P(Pas­cal’s God)>P(pro­fes­sor god).
In this re­spect, I’m not sure how the two prob­lems are differ­ent.

Well, that feels like an ob­vi­ous no. I’m hu­man though, so the ob­vi­ous­ness is very much worth­less.

My thought is to compare, with full evidence-weighing and such (including that it’s more likely that anyone would make some other threat, probably a more credible one, rather than this), the EV of a policy of denying Pascal’s Mugging (a few occasional very tiny odds of very huge calamity) against the EV of a policy of falling for Pascal’s Mugging.

A policy of giving the money seems like posting “Please Pascal-mug me!” publicly on reddit or a Facebook full of rationalists or something. You’re bound to make the odds cumulatively shoot up by inviting more instances of the mugging, including the chance that someone takes your money and still somehow executes the hugely negative-EV thing they’re threatening you with. The EV clearly seems better with a policy of denying the mugging, doesn’t it?

Let’s move away from the signalling and such, so that such a policy does not lead to a larger-than-five-bucks loss (though no amount of signalling loss actually overcomes 3^^^3·P(3^^^3)).

Assume you receive some mild evidence that is more likely in the case of an imminent -EV singularity than an imminent +EV singularity. Maybe you find a memo floating out of some “abandoned” basement window that contains some design notes for a half-assed FAI. Something that updates you toward the bad case (but only a very little amount, barely credible). Do you sneak in and microwave the hard drive? Our current best understanding of an ideal decision theorist would.

I tried to make that iso­mor­phic to the meat of the prob­lem. Do you think I got it? We face prob­lems iso­mor­phic to that ev­ery day, and we don’t tend to act on them.

Now consider that you observe some reality-glitch that causes you to conclude that you are quite sure that you are a Boltzmann brain. At the same time, you see a child that is drowning. Do you think a happy thought (the best idea if you are a Boltzmann brain, IMO), or move quickly to save the child (a much better idea, but only in the case where the child is real)? I would still try to save the child (I hope), as would an ideal decision theorist.

Those two ex­am­ples are equiv­a­lent as far as our ideal de­ci­sion the­o­rist is con­cerned, but have op­po­site in­tu­ition. Who’s wrong?

I lean toward our intuition being wrong, because it depends on a lot of irrelevant stuff, like whether the utilities are near-mode (drowning child, microwaved hard drive) or far-mode (+/-EV singularity, Boltzmann brain). Also, all the easy ways to make the decision theorist wrong don’t work unless you are ~100% sure that unbounded expected-utility maximizing is stupid.

On the other hand, I’m still not about to start paying out on Pascal’s wager.

Now consider that you observe some reality-glitch that causes you to conclude that you are quite sure that you are a Boltzmann brain. At the same time, you see a child that is drowning. Do you think a happy thought (the best idea if you are a Boltzmann brain, IMO), or move quickly to save the child (a much better idea, but only in the case where the child is real)? I would still try to save the child (I hope), as would an ideal decision theorist.

In this example, I very much agree, but not for any magical sentiment or urge. I simply don’t trust my brain and my own ability to gather knowledge and infer/deduce the right things enough to override the high base rate that there is an actual child drowning that I should go save. It would take waaaaaay more conclusive evidence and experimentation to confirm the Boltzmann-brain hypothesis, and then some more to make sure the drowning child is actually such a phenomenon (did I get that right? I have no idea what a Boltzmann brain is).

Re­gard­ing the first ex­am­ple, that’s a very good case. I don’t see my­self fac­ing situ­a­tions that could be framed similarly very of­ten though, to be hon­est. In such a case, I would prob­a­bly do some­thing equiv­a­lent to the hard drive microwav­ing tac­tic, but would first at­tempt to run failsafes in case I’m the one see­ing the wrong thing—this is com­pa­rable to the in­junc­tion against tak­ing power in your own hands be­cause you’re ob­vi­ously able to make the best use of it. There are all kinds of rea­sons I might be wrong about the FAI, and might be do­ing some­thing wrong by microwav­ing the drive. Faced with a hard now-or-never set­ting with im­me­di­ate in­stant per­ma­nent con­se­quences, a clean two-path de­ci­sion tree (usu­ally as­tro­nom­i­cally un­likely in the real world, we just get the illu­sion that it is one), I would definitely take the microwave op­tion. In more likely sce­nar­ios though, there are all kinds of things to do.

Well, if I do have such evidence, then this is time for some Bayes. If I’ve got the right math, then it’ll depend on information that I don’t have: what actually happens if I’m a Boltzmann brain and I try to save the kid anyway?

The un­known in­for­ma­tion seems to be out­weighed by the clear +EV of sav­ing the child, but I have a hard time quan­tify­ing such un­known un­knowns even with WAGs, and my mas­tery of con­tin­u­ous prob­a­bil­ity dis­tri­bu­tions isn’t up to par to prop­erly calcu­late some­thing like this any­way.

In this case, my cu­ri­os­ity as for what might hap­pen is ac­tu­ally a +V, but even with­out that, I think I’d still try to save the child. My best guess is ba­si­cally “My built-in func­tion to eval­u­ate this says save the child, and this func­tion ap­par­ently knows more about the prob­lem than I do, and I have no valid math that says oth­er­wise, so let’s go with that” in such a case.

If you are a Boltzmann brain, none of this is real and you will blink out of existence in the next second. If you think a happy thought, that’s a good thing. If you move to rescue the child, you will be under stress and no child will end up being rescued.

If you don’t like the Boltzmann-brain gamble, substitute something else where you have it on good authority that nothing is real except your own happiness or whatever.

(My answer is that the tiny possibility that “none of this is real” is wrong is much more important (in the sense that more value is at stake) than the mainline possibility that none of this is real, so the mainline Boltzmann case more or less washes out in the noise and I act as if the things I see are real.)

EDIT: The curiosity thing is a fake justification: I find it suspicious that moving to save the child also happens to be the most interesting experiment you could run.

The injunction “I can’t be in such an epistemic state, so I will go with the autopilot” is a good solution that I hadn’t thought of. But then in the case of pure morality, without epistemic concerns and whatnot, which is better: save the very unlikely child, or think a happy thought? (My answer is above, but I still take the injunction in practice.)

Yes, I was aware the cu­ri­os­ity thing is not a valid rea­son, which is why I only qual­ify it as “+V”. There are other op­tions which give much greater +V. It is not an op­ti­mum.

Re­gard­ing you de­scrip­tion of the Vs, I guess I’m a bit skewed in that re­gard. I don’t per­ceive happy thought and stress/​sad­ness as clean-cut + and—util­ities. Ce­teris Paribus, I find stress to be pos­i­tive util­ity against the back­drop of “lack of any­thing”. I think there’s a Type 1 /​ Type 2 thing go­ing on, with the “con­scious” as­sign­ing some value to what’s au­to­matic or built-in, but I don’t re­mem­ber the right vo­cab­u­lary and recre­at­ing a proper ter­minol­ogy from re­duc­tions would take a lot of time bet­ter spent study­ing up on the already-es­tab­lished con­ven­tions. Ba­si­cally, I con­sciously value all feel­ings equiv­a­lently, with a built-in val­u­a­tion of what my in­stinct /​ hu­man-built-in-de­vices val­ues too, such that many small forms of pain are ac­tu­ally more pleas­ant than not feel­ing any­thing in par­tic­u­lar, but strong pain is less pleas­ant than tem­po­rary lack of feel­ing.

Stuck in a two-branch de­ci­sion-the­o­retic prob­lem be­tween “lifelong tor­ture” and “lifelong lack of sen­sa­tion or feel­ing”, my cur­rent con­scious mind is edg­ing to­wards the former, as­sum­ing the lat­ter means I don’t get that rush from cu­ri­os­ity and figur­ing stuff out any­more. Of course, in prac­tice I’m not quite so sure that none of the built-in mechanisms I have in my brain would get me to choose oth­er­wise.

Anyway, just wanted to chip in that the utilitarian math for the “if I’m a Boltzmann brain, I want a happy thought rather than a bit of stress” case isn’t quite so clear-cut for me personally, since the happy thought might not “succeed” in being produced or being really happy, and the stress might be valued positively anyway and is probably more likely to “succeed”. This isn’t the real motivation for my choices (so it’s an excuse/rationalization if I decide based on this), but it is an interesting bit of detail and trivia, IMO.

Well, if I have evidence that I’m a special kind of telekinetic who can only move stuff with his mind when not physically moving (i.e. not sending signals to my own muscles), rather than a Boltzmann brain, then unless I’m missing something I really do prefer staying immobile and saving the child with my thoughts instead of jumping in and wasting a lot of energy (this is assuming there are no long-term consequences, like other people seeing me save a child with my mind). But I’d still jump in anyway, because my mental machinery overrides the far knowledge that I can almost certainly do it without moving.

It would take a lot of ac­tual train­ing in or­der to over­come this and start ac­tu­ally us­ing the telekine­sis. I think in such a situ­a­tion, an ideal ra­tio­nal­ist would use telekine­sis in­stead of jump­ing in the wa­ter—not to men­tion the prac­ti­cal ad­van­tages of sav­ing the child faster and in a safer man­ner (also with no risk to your­self!), as­sum­ing you have that level of con­trol over your telekinetic pow­ers.

Well, to make it fit Pas­cal’s Wager pat­tern a bit more, as­sume that you’re aware that telekinet­ics like you some­times have a finite, very small amount of phys­i­cal en­ergy you can spend dur­ing your en­tire life, and once you’re out of it you die. You have un­limited “telekinetic en­ergy”. Sav­ing the child is, if this is true, go­ing to chop off a good 95% of your re­main­ing lifes­pan and per­ma­nently sac­ri­fice any pos­si­bil­ity of be­com­ing im­mor­tal.

If you move to res­cue the child, you will be un­der stress and no child will end up be­ing res­cued.

Boltz­mann brains aren’t ac­tu­ally able to put them­selves un­der stress, any more than they can res­cue chil­dren or even think.

Aside from this, I’m not sure I ac­cept the as­sump­tion that I should care about the emo­tional ex­pe­riences of boltz­mann brains (or rep­re­sen­ta­tion of there be­ing such ex­pe­riences). That is, I be­lieve I re­ject:

If you are a boltz­mann brain, none of this is real and you will blink out of ex­is­tence in the next sec­ond. If you think a happy thought, that’s a good thing.

For the pur­pose of choos­ing my de­ci­sions and de­ci­sion mak­ing strat­egy for the pur­pose of op­ti­miz­ing the uni­verse to­wards a preferred state I would weigh in­fluence over the freaky low en­tropy part of the uni­verse (ie. what we be­lieve ex­ists) more than in­fluence over the ridicu­lous amounts of noise that hap­pens to in­clude boltz­mann brains of ev­ery kind even if my de­ci­sions had any in­fluence over the lat­ter at all.

There is a caveat that the above would be differ­ent if I was able to colonize and ex­ploit the high en­tropy parts of the uni­verse some­how but even then it wouldn’t be the noise-in­clud­ing-boltz­mann brains that I val­ued but what­ever lit­tle ne­gen­tropy that re­mained to be har­vested. If I hap­pened to seek out and find copies of my­self within the ran­dom fluc­tu­a­tions and pre­serve them then I would con­sider what I am do­ing to be roughly speak­ing cre­at­ing clones of my­self via a rather ec­cen­tric and in­effi­cient en­g­ineer­ing pro­cess in­volv­ing ‘search for state match­ing speci­fi­ca­tion then re­move ev­ery­thing else’ rather than ‘put stuff into state match­ing speci­fi­ca­tion’.

You’re right, an actual Boltzmann brain would not have time to do either. It was just an illustrative example to get you to think of something like Pascal’s wager with inverted near-mode and far-mode.

If you don’t like the Boltzmann brain gamble, substitute something else where you have it on good authority that nothing is real except your own happiness or whatever.

It was just an illustrative example to get you to think of something like Pascal’s wager with inverted near-mode and far-mode.

It was mainly the Boltzmann brain component that caught my attention. Largely because yesterday I was considering how the concept of “Boltzmann’s Marbles” impacts when and whether there was a time that could make the statement “There was only one marble in the universe” true.

You still haven’t ac­tu­ally calcu­lated the di­su­til­ity of hav­ing a policy of giv­ing the money, ver­sus a policy of not giv­ing the money. You’re just wav­ing your hands. Say­ing “the EV clearly seems bet­ter” is no more helpful than your ini­tial “ob­vi­ous”.

The calculation I had in mind was basically that if those policies really do have those effects, then which one is superior depends entirely on the ratio between: 1) the difference between the likelihoods of a large calamity when you pay vs. when you don’t, and 2) the actual increase in the frequency of muggings.

The math I have, the way I understand it, removes the actual -EV of the mugging (keeping only the difference) from the equation and saves some disutility calculation. In my mind, you’d need some pretty crazy values for the above ratio in order for the policy of accepting Pascal’s Muggings to be worthwhile, and my WAGs are at 2% for the first and about 1000% for the second, with a base rate of around 5 total muggings if you have a policy of denying them.

I have a high con­fi­dence rat­ing for val­ues that stay within the ra­tio that makes the de­nial policy fa­vor­able, and I find the val­ues that would be re­quired for fa­vor­ing the ac­cep­tance policy highly un­likely with my pri­ors.

Apolo­gies if it seemed like I was blow­ing air. I ac­tu­ally did some stuff on pa­per, but post­ing it seemed ir­rele­vant when the vast ma­jor­ity of LW users ap­pear to have far bet­ter mas­tery of math­e­mat­ics and the abil­ity to do such calcu­la­tions far faster than I can. I thought I’d suffi­ciently re­stricted the space of pos­si­ble calcu­la­tions with my de­scrip­tion in the grand­par­ent.

I might still be com­pletely wrong though. My maths have er­rors a full 25% of the time un­til I’ve ac­tu­ally pro­grammed or tested them some­how, for av­er­age math prob­lems.

But how do you know if some­one wanted to up­vote your post for clev­er­ness, but didn’t want to ex­press the mes­sage that they were mugged suc­cess­fully? Upvot­ing cre­ates con­flict­ing mes­sages for that spe­cific com­ment.

One problem with discounting your prior based on the time complexity of a computation is that it practically forces you to believe either that P = BQP or that quantum mechanics doesn’t work. If you discount based on space complexity, you might worry that torturing 3^^^3 people might actually be a small-space computation.

First, I didn’t read all of the above comments, though I read a large part of them.

Regarding the intuition that makes one question Pascal’s mugging: I think it is likely that there was strong survival value in the ancestral environment to being able to detect and disregard statements that would cause you to pay money to someone else without there being any way to check whether those statements were true. Anyone without that ability would have been mugged to extinction long ago. This makes more sense if we regard the origin of our built-in utility function as a /very/ coarse approximation of our genes’ survival fitness.

Re­gard­ing what the FAI is to do, I think the mis­take made is as­sum­ing that the prior util­ity of do­ing rit­ual X is ex­actly zero, so that a very small change in our prob­a­bil­ities would make the ex­pected util­ity of X pos­i­tive. (Where X is “give the Pas­cal mug­ger the money”).
A sufficiently smart FAI would have thought about the possibility of being Pascal-mugged long before it actually happens, and would in fact consider it a likely event to sometimes happen. I am not saying that this actually happening is not a tiny sliver of evidence in favor of the mugger telling the truth, but it is very tiny. The FAI would (assuming it had enough resources) compute for every possible Matrix scenario the appropriate probabilities and utilities for every possible action, taking the scenario’s complexity into account. There is no reason to assume the prior expected utility of any religious ritual (such as paying Pascal muggers, whose statements you can’t check) is exactly zero. Maybe the FAI finds that there is a sufficiently simple scenario in which a god exists and in which it is extremely high-utility to worship that god, more so than in any alternative scenario. Or in which one should give in to (specific forms of) Pascal’s mugging.

How­ever, the prob­lem as pre­sented in this blog­post im­plic­itly as­sumes that the prior prob­a­bil­ities the FAI holds are such that the tiny sliver of prob­a­bil­ity pro­vided by one more in­stance of Pas­cal’s mug­ging hap­pen­ing, is enough to push the prob­a­bil­ity of the sce­nario of ‘Ex­tra-Ma­trix de­ity kil­ling lots of peo­ple if I don’t pay’ over that of ‘Ex­tra-Ma­trix de­ity kil­ling lots of peo­ple if I do pay’. Since these two sce­nar­ios need not have the ex­act same Kol­mogorov com­plex­ity this is un­likely.

In short, ei­ther the FAI is already re­li­gious, (which may in­clude as a rit­ual ‘give money to peo­ple who speak a cer­tain passphrase’) or it is not, but the event of a Pas­cal mug­ging hap­pen­ing is un­likely to change its be­liefs.

Now, the ques­tion be­comes if we should ac­cept the FAI do­ing things that are ex­pected to fa­vor a huge num­ber of ex­tra-ma­trix peo­ple at a cost to a smaller num­ber of in­side-ma­trix peo­ple. If we ac­tu­ally count ev­ery hu­man life as equal, and we ac­cept what Solomonoff-in­ducted bayesian prob­a­bil­ity the­ory has to say about huge pay­off-tiny prob­a­bil­ity events and dutch books, the FAI’s choice of re­li­gion would be the ra­tio­nal thing to do. Else, we could add a term to the AI’s util­ity func­tion to fa­vor in­side-ma­trix peo­ple over out­side-ma­trix peo­ple, or we could make it fa­vor cer­tainty (of benefit­ting peo­ple known to ac­tu­ally ex­ist) over un­cer­tainty (of out­side-ma­trix peo­ple not known to ac­tu­ally ex­ist).

This might be overly sim­plis­tic, but it seems rele­vant to con­sider the prob­a­bil­ity per mur­der. I am feel­ing a bit of scope in­sen­si­tivity on that par­tic­u­lar prob­a­bil­ity, as it is far too small for me to com­pute, so I need to go through the steps.

If some­one tells me that they are go­ing to mur­der one per­son if I don’t give them $5, I have to con­sider the prob­a­bil­ity of it: not ev­ery at­tempted mur­der is suc­cess­ful, af­ter all, and I don’t have nearly as much in­cen­tive to pay some­one if I be­lieve they won’t be suc­cess­ful. Fur­ther, most peo­ple don’t ac­tu­ally at­tempt mur­der, and the cost to that per­son of tel­ling me they will mur­der some­one if they don’t get $5 is much, much smaller then the cost of ac­tu­ally mur­der­ing some­one. Con­se­quences usu­ally fol­low from mur­der, af­ter all. I also have to con­sider the prob­a­bil­ity that this per­son is in­sane and doesn’t care about the con­se­quences: only the $5.

Still, only .00496% of people are murdered in a year (according to Wolfram Alpha, at least). And while I would assign a higher probability to a person claiming they will murder someone, it wouldn’t jump dramatically: they could be lying, they could try but fail, etc. Even if I treat “I will kill someone” as a 90% accurate test with only a 10% false positive rate (which I think is generous in the case of $5 with no additional evidence), the probability comes out as only about .04%. Even if the test were 99% accurate with a 1% false positive rate, EXTREMELY generous odds, there would only be about a .5% total probability of it occurring.
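The Bayesian update described above can be sketched out. The numbers here are the commenter’s rough figures (the .00496% yearly murder base rate and the 90%/10% test), not vetted statistics:

```python
def posterior(base_rate, sensitivity, false_positive):
    """P(murder | threat) via Bayes' rule, treating the threat as a
    noisy test for an actual upcoming murder."""
    p_threat = sensitivity * base_rate + false_positive * (1 - base_rate)
    return sensitivity * base_rate / p_threat

# Yearly murder base rate of .00496%, threat treated as a 90% accurate
# test with a 10% false positive rate:
p = posterior(0.0000496, 0.90, 0.10)   # ≈ 0.00045, i.e. about .04%
```

Even the extremely generous 99%-accurate, 1%-false-positive version, `posterior(0.0000496, 0.99, 0.01)`, only reaches about half a percent.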

In re­al­ity, I think there would be some ev­i­dence in the case of one mur­der. At very least I could get strong so­ciolog­i­cal cues that the per­son was likely to be tel­ling the truth. How­ever, since I am mov­ing to an end point where they will be kil­ling 3^^^^3 peo­ple, I’ll leave that aside as it is ir­rele­vant to the end ex­am­ple.

If such a per­son claimed they would mur­der 2 peo­ple, it would de­pend on whether I thought the prob­a­bil­ities of the events oc­cur­ring to­gether were de­pen­dent or in­de­pen­dent: if him kil­ling one per­son made it more likely that he would kill two, given the event (the threat) in ques­tion.

Now, if he says he will kill two peo­ple, and he kills one, he is un­likely to stop be­fore kil­ling an­other. BUT, there are more chances for com­pli­ca­tion or failure, and the cost:benefit for him shrinks by half, mak­ing the prob­a­bil­ity that he man­ages to or tries to kill any­one smaller. Th­ese num­bers in re­al­ity would be af­fected by cir­cum­stance: it is a lot eas­ier to kill two peo­ple with a pis­tol or a bomb than it is with your bare hands. But since I see no bomb or pis­tol and he is claiming some mechanism I have no ev­i­dence for, we’ll ig­nore that re­al­ity for now.

I had trouble finding information on the rate of double homicide vs. single homicide to use as a baseline, but it seems likely that it is neither totally dependent, nor totally independent. In order to believe the threat credible, I have to believe (after hearing the threat) that they will attempt to kill two people, successfully kill one, AND successfully kill another. And if I put the probability of A+B at .04%, I can’t very well put A+B+C any higher. Since I used a 90% accurate test for my initial calculation, let’s apply it twice: 81%. We’ll assume that the false negative rate (he murders people even when he says he won’t) stays constant.

This means that each additional murder is slightly more than 90% as likely to occur as the murder before it. Now, it isn’t exact, and these numbers get really, really small, so I’m looking at 3^3 as a reference.

At 3^3, the cost has gone up 27x if he kills peo­ple, but the prob­a­bil­ity of the event has gone down to .06 of what it was. So, some­thing like 1.7x more costly, given what was said above.

But all this was de­pen­dent on sev­eral as­sumed figures. So at what points does it bal­ance out?

I’m a little tired for doing all the math right now, but some quick work showed that being only 80% sure of the test, with a 10% false positive rate, would be enough for the expected cost to go down continuously. So if I am less than 80% sure of the test “he says he will murder one person if I don’t give him 5 dollars”, then I can be sure that the probability that he will kill 3^^^^3 people shrinks far, far faster than the cost grows.
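The shape of that argument can be sketched with a toy model: if each additional victim succeeds with some per-victim probability s below 1, the expected death toll of an N-victim threat is N * s^N, which peaks early and then collapses toward zero. The s = 0.9 figure is the commenter’s, and treating the discount as purely multiplicative is an assumption:

```python
def expected_deaths(n, s=0.9):
    """Expected toll of an n-victim threat when each additional victim
    succeeds with probability s, discounted multiplicatively."""
    return n * s ** n

# The 3^3 reference case: a 27-victim threat costs about 1.7x as much in
# expectation as a single-victim threat, not 27x.
toll_27 = expected_deaths(27)       # ≈ 1.57
# For large n the expected toll vanishes; 3^^^^3 is unimaginably past this.
toll_1000 = expected_deaths(1000)   # ≈ 2e-43
```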

I’m as­sum­ing that I am get­ting my math right here, and I am quite tired, so if any­one wishes to cor­rect me on some por­tion of this I would be happy for the crit­i­cism.

It does seem that the prob­a­bil­ity of some­one be­ing able to bring about the deaths of N peo­ple should scale as 1/​N, or at least 1/​f(N) for some mono­ton­i­cally in­creas­ing func­tion f. 3^^^^3 may be a more sim­ply speci­fied num­ber than 1697, but it seems “in­tu­itively ob­vi­ous” (as much as that means any­thing) that it’s eas­ier to kill 1697 peo­ple than 3^^^^3. Un­der this rea­son­ing, the likely deaths caused by not giv­ing the mug­ger $5 are some­thing like N/​f(N), which de­pends on what f is, but it seems likely that it con­verges to zero as N in­creases.
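As a sketch of that convergence, take a hypothetical f(N) = 2^N (any sufficiently fast-growing f behaves the same way): the expected death toll N/f(N) of refusing shrinks toward zero as the threatened number grows:

```python
def expected_toll(n, f=lambda n: 2 ** n):
    """Expected deaths from a threat against n people, if the chance of
    anyone being able to pull it off scales as 1/f(n). f is hypothetical."""
    return n / f(n)

tolls = [expected_toll(n) for n in (10, 100, 1000)]
# tolls shrink monotonically: ~0.0098, ~7.9e-29, ~9.3e-299
```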

It is an awfully difficult ques­tion, though, be­cause how do we know we don’t live in a world where 3^^^^3 peo­ple could die at any mo­ment? It seems un­likely, but then so do a lot of things that are real.

Per­haps the prob­lem lies in the idea that a Tur­ing ma­chine can cre­ate en­tities that have the moral sta­tus of hu­mans. If there’s a ma­chine out there that can cre­ate and de­stroy 3^^^^3 hu­mans on a whim, then are hu­man lives re­ally worth that much? But, on the other hand, there are laws of physics out there that have been demon­strated to cre­ate al­most 3^^3 hu­mans, so what is one hu­man life worth on that scale?

On an­other note, my girlfriend says that if some­one tried this on her, she’d prob­a­bly give them the $5 just for the laugh she got out of it. It would prob­a­bly only work once, though.

I think you’ve just perfectly illus­trated how some Scope Insen­si­tivity can be good thing.

Because a mind with perfect scope sensitivity will be diverted into chasing impossibly tiny probabilities for impossibly large rewards. If a good rationalist must win, then a good rationalist should commit to avoiding supposed rationality that makes him lose like that.

So, here’s a solution. If an event’s probability is too tiny for it to be reasonably likely to occur in your lifespan, treat its bait as actually impossible. If you don’t, you’ll inevitably crash into effective ineffectiveness.

I think this sce­nario is in­ge­nious. Here are a few ideas, but I’m re­ally not sure how far one can pur­sue them /​ how ‘much work’ they can do:

(1) Per­haps the agent needs some way of ‘ab­solv­ing it­self of re­spon­si­bil­ity’ for the evil/​ar­bi­trary/​un­rea­son­able ac­tions of an­other be­ing. The ac­tion to be performed is the one that yields high­est ex­pected util­ity but only along causal path­ways that don’t go through an ad­ver­sary that has been la­bel­led as ‘un­rea­son­able’.

(Ex­cept this ap­proach doesn’t de­fuse the vari­a­tion that goes “You can never wipe your nose be­cause you’ve com­puted that the prob­a­bil­ity of this ac­tion kil­ling 3^^^^3 peo­ple in a par­allel uni­verse is ever so slightly greater than the prob­a­bil­ity of it sav­ing that num­ber of peo­ple”.)

(2) We only have a fixed amount of ‘moral con­cern’, ap­por­tioned some­how or other to the be­ings we care about. Our util­ity func­tion looks like: Sum(over be­ings X) Con­cernFor(X)*Hap­pinessOf(X). Allo­ca­tion of ‘moral con­cern’ is a ‘com­pet­i­tive’ pro­cess. The only way we can gain some con­cern about Y is to lose a bit of con­cern about some X, but if we have reg­u­lar and in some sense ‘pos­i­tive’ in­ter­ac­tions with X then our con­cern for X will be con­stantly ‘re­plen­ish­ing it­self’. When the ma­gi­cian ap­pears and tells us his story, we may ac­quire a tiny bit of con­cern about him and the peo­ple he men­tions, but the parts of us that care about the peo­ple we know (a) aren’t ‘told’ the ma­gi­cian’s story and thus (b) re­fuse to ‘re­lin­quish’ very much.

The trouble with this is that it sounds too reminiscent of the insanely stupid moral behaviour of human beings (where e.g. they give exactly as much money to save a hundred penguins as to save ten thousand).

(3) We com­pletely aban­don the prin­ci­ple of us­ing min­i­mum de­scrip­tion length as some kind of ‘uni­ver­sal prior’. (For some rea­son. And re­place it with some­thing else. For some rea­son.)

Re­gard­ing the com­ments about ex­plod­ing brains, it’s a won­der to me that we are able to think about these is­sues and not lose our san­ity. How is it that a brain evolved for hunt­ing/​gath­er­ing/​so­cial­iz­ing is able to con­sider these prob­lems at all? Not only that, but we seem to have some use­ful in­tu­itions about these prob­lems. Where on Earth did they come from?

Nick> Does your pro­posal re­quire that one ac­cepts the SIA?

Yes, but us­ing a com­plex­ity-based mea­sure as the an­thropic prob­a­bil­ity mea­sure im­plies that the SIA’s effect is limited. For ex­am­ple, con­sider two uni­verses, the first with 1 ob­server, and the sec­ond with 2. If all of the ob­servers have the same com­plex­ity you’d as­sign a higher prior prob­a­bil­ity (i.e., 2⁄3) to be­ing in the sec­ond uni­verse. But if the sec­ond uni­verse has an in­finite num­ber of ob­servers, the sum of their mea­sures can’t ex­ceed the mea­sure of the uni­verse as a whole, so the “pre­sump­tu­ous philoso­pher” prob­lem is not too bad.

Nick> If I un­der­stand your sug­ges­tion cor­rectly, you pro­pose that the same an­thropic prob­a­bil­ity mea­sure should also be used as a mea­sure of moral im­por­tance.

Yes, in fact I think there are good ar­gu­ments for this. If you have an an­thropic prob­a­bil­ity mea­sure, you can ar­gue that it should be used as the mea­sure of moral im­por­tance, since ev­ery­one would pre­fer that was the case from be­hind the veil of ig­no­rance. On the other hand, if you have a mea­sure of moral im­por­tance, you can ar­gue that for de­ci­sions not in­volv­ing ex­ter­nal­ities, the global best case can be ob­tained if peo­ple use that mea­sure as the an­thropic prob­a­bil­ity mea­sure and just con­sider their self in­ter­ests.

BTW, when us­ing both an­thropic rea­son­ing and moral dis­count­ing, it’s easy to ac­ci­den­tally ap­ply the same mea­sure twice. For ex­am­ple, sup­pose the two uni­verses both have 1 ob­server each, but the ob­server in the sec­ond uni­verse has twice the mea­sure of the one in the first uni­verse. If you’re asked to guess which uni­verse you’re in with some pay­off if you guess right, you don’t want to think “There’s 2⁄3 prob­a­bil­ity that I’m in the sec­ond uni­verse, and the pay­off is twice as im­por­tant if I guess ‘sec­ond’, so the ex­pected util­ity of guess­ing ‘sec­ond’ is 4 times as much as the EU of guess­ing ‘first’.”
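A toy version of this double-counting pitfall, under the assumptions of the example (two universes, one observer each, the second observer with twice the measure of the first):

```python
measure = {1: 1.0, 2: 2.0}
total = sum(measure.values())

# WRONG: the measure is applied twice, once as an anthropic probability
# and once again as a moral weight on the payoff.
p_anthropic = {u: measure[u] / total for u in measure}     # {1: 1/3, 2: 2/3}
eu_double = {u: p_anthropic[u] * measure[u] for u in measure}
ratio_double = eu_double[2] / eu_double[1]   # 4.0: guessing "second" looks 4x better

# RIGHT: apply the measure exactly once when scoring the bet
# "guess which universe you are in".
ratio_single = measure[2] / measure[1]       # 2.0
```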

I think that to avoid this kind of confusion and other anthropic reasoning paradoxes (see http://groups.google.com/group/everything-list/browse_frm/thread/dd21cbec7063215b/), it’s best to consider all decisions and choices from a multiversal objective-deterministic point of view. That is, when you make a decision between choices A and B, you should think “would I prefer if everyone in my position (i.e., having the same perceptions and memories as me) in the entire multiverse chose A or B?” and ignore the temptation to ask “which universe am I likely to be in?”.

But that may not work un­less you be­lieve in a Teg­markian mul­ti­verse. If you don’t, you may have to use both an­thropic rea­son­ing and moral dis­count­ing, be­ing very care­ful not to dou­ble-count.

Wei, no I don’t think I con­sid­ered the pos­si­bil­ity of dis­count­ing peo­ple by their al­gorith­mic com­plex­ity.

I can see that in the con­text of Everett it seems plau­si­ble to weigh each ob­server with a mea­sure pro­por­tional to the am­pli­tude squared of the branch of the wave func­tion on which he is liv­ing. More­over, it seems right to use this mea­sure both to calcu­late the an­thropic prob­a­bil­ity of me find­ing my­self as that ob­server and the moral im­por­tance of that ob­server’s well-be­ing.

Assigning anthropic probabilities over infinite domains is problematic. I don’t know of a fully satisfactory explanation of how to do this. One natural approach to explore might be to assign some Turing-machine-based measure to each of the infinite observers. Perhaps we could assign plausible probabilities by using such an approach (although I’d like to see this worked out in detail before accepting that it would work).

If I un­der­stand your sug­ges­tion cor­rectly, you pro­pose that the same an­thropic prob­a­bil­ity mea­sure should also be used as a mea­sure of moral im­por­tance. But there seems to me to be a prob­lem. Con­sider a sim­ple clas­si­cal uni­verse with two very similar ob­servers. On my reck­on­ing they should each get an­thropic prob­a­bil­ity mea­sure 1⁄2 (re­ject­ing SIA, the Self-Indi­ca­tion As­sump­tion). Yet it ap­pears that they should each have a moral weight of 1. Does your pro­posal re­quire that one ac­cepts the SIA? Or am I mis­in­ter­pret­ing you? Or are you try­ing to ex­pli­cate not to­tal util­i­tar­i­anism but av­er­age util­i­tar­i­anism?

Even if the Ma­trix-claimant says that the 3^^^^3 minds cre­ated will be un­like you, with in­for­ma­tion that tells them they’re pow­er­less, if you’re in a gen­er­al­ized sce­nario where any­one has and uses that kind of power, the vast ma­jor­ity of mind-in­stan­ti­a­tions are in leaves rather than roots.

You would have to aban­don Solomonoff In­duc­tion (or mod­ify it to ac­count for these an­thropic con­cerns) to make this work. Solomonoff In­duc­tion doesn’t let you con­sider just “gen­er­al­ized sce­nar­ios”; you have to calcu­late each one in turn, and even­tu­ally one of them is guaran­teed to be nasty.

To para­phrase Wei’s ex­am­ple: the mug­ger says, “Give me five dol­lars, or I’ll simu­late and kill 3^^^^3 peo­ple, and I’ll make sure they’re aware that they are at the leaf and not at the node”. Con­grat­u­la­tions, you now have over 3^^^^3 bits of ev­i­dence (in fact, it’s a tau­tol­ogy with prob­a­bil­ity 1) that the fol­low­ing propo­si­tion is true: “if the mug­ger’s state­ment is cor­rect, then I am the one per­son at the node and am not one of the 3^^^^3 peo­ple at the leaf.” By Solomonoff In­duc­tion, this sce­nario where his state­ment is liter­ally true has > 1 /​ 2^(10^50) prob­a­bil­ity, as it’s eas­ily de­scrib­able in much less than 10^50 bits. Once you try to eval­u­ate the util­ity differ­en­tial of that sce­nario, boom, we’re right back where we started.

On the other hand, you could mod­ify Solomonoff In­duc­tion to re­flect an­thropic con­cerns, but I’m not sure it’s any bet­ter than just mod­ify­ing the util­ity func­tion to re­flect an­thropic con­cerns.

Eliezer, what if the mug­ger (Ma­trix-claimant) also says that he is the only per­son who has that kind of power, and he knows there is just one copy of you in the whole uni­verse? Is the prob­a­bil­ity of that be­ing true less than 1/​3^^^^3?

Robin’s an­thropic ar­gu­ment seems pretty com­pel­ling in this ex­am­ple, now that I un­der­stand it. It seems a lit­tle less clear if the Ma­trix-claimant tried to mug you with a threat not in­volv­ing many minds. For ex­am­ple, maybe he could claim that there ex­ists some gi­ant mind, the kil­ling of which would be as eth­i­cally sig­nifi­cant as the kil­ling of 3^^^^3 in­di­vi­d­ual hu­man minds? Maybe in that case you would an­throp­i­cally ex­pect with over­whelm­ingly high prob­a­bil­ity to be a fig­ment in­side the gi­ant mind.

Maybe the ori­gin of the para­dox is that we are ex­tend­ing the prin­ci­ple of max­i­miz­ing ex­pected re­turn be­yond its do­main of ap­pli­ca­bil­ity. Un­like Bayes for­mula, which is an unas­sailable the­o­rem, the prin­ci­ple of max­i­miz­ing ex­pected re­turn is per­haps just a model of ra­tio­nal de­sire. As such it could be wrong. When deal­ing with rea­son­ably high prob­a­bil­ities, the model seems in­tu­itively right. With small prob­a­bil­ities it seems to be just an ab­strac­tion, and there is not much in­tu­ition to com­pare it to. When con­sid­er­ing a game with pos­i­tive ex­pected re­turn that comes from big pay­offs and small prob­a­bil­ities, it re­duces to the in­tu­itive case if we have the op­por­tu­nity to play the game many times, on the or­der of one over the pay­off prob­a­bil­ity. This type of fre­quen­tist ar­gu­ment seems to be where the prin­ci­ple comes from in the first place. How­ever, if the prob­a­bil­ities are so small that there is no pos­si­bil­ity of play­ing the game that many times, then maybe a ra­tio­nal per­son just ig­nores it rather than du­tifully in­vest­ing in an es­sen­tially cer­tain loss. Of course, if we rel­e­gate the prin­ci­ple of max­i­miz­ing ex­pected re­turn to be­ing just a limit­ing case, this leaves open the ques­tion of what more gen­eral model un­der­lies it.

Eliezer> It’s hard to see why I would con­sider this the right thing to do—where does this mys­te­ri­ous “mea­sure” come from?

Sup­pose you plan to mea­sure the po­lariza­tion of a pho­ton at some fu­ture time and thereby split the uni­verse into two branches of un­equal weight. You do not treat peo­ple in these two branches as equals, but in­stead value the peo­ple in the higher-weight branch more, right? Can you an­swer why you con­sider that to be the right thing to do? That’s not a rhetor­i­cal ques­tion, btw. If I knew the an­swer to that ques­tion I think I’d also know why dis­count­ing peo­ple by al­gorith­mic com­plex­ity (or some other func­tion) might be the right thing to do.

Stephen> Men­tion­ing quan­tum me­chan­ics serves only as a dis­trac­tion.

In clas­si­cal physics, the uni­verse doesn’t branch, but in­stead ev­ery­thing is pre­de­ter­mined by the start­ing con­di­tions and laws of physics. There is no is­sue of peo­ple in un­equal-weight branches, which I think might be analo­gous to peo­ple with differ­ent al­gorith­mic com­plex­ities. That’s why I brought up QM.

Isn’t the point essentially that we believe the man’s statement is uncorrelated with any moral facts? I mean, if we did, then it’s pretty clear we can be morally forced into doing something.

Is it reasonable to believe the statement is uncorrelated with any facts about the existence of many lives? It seems so, since we have no substantial experience with “Matrices”, people from outside the simulation visiting us, 3^^^^^^3, the simulation of moral persons, etc.

Con­sider, the state­ment ‘there is a woman be­ing raped around the cor­ner’. We are morally obliged to look around the cor­ner. We have no more di­rect proof of the truth of this state­ment than of Pas­cal’s mug­ger’s state­ment. But we have good rea­son to be­lieve the state­ment is cor­re­lated with a fact in one case, but no such rea­son in the other.

Can a machine be made that will consistently give zero correlation to this sort of thing? Hell if I know. Probably not, since if you iterate the known enough you get the absurd. But any claim that the conditional probability of 3^^^^3 lives being simulated and destroyed is 1/(3^^^^^3) or something is a pile of horseshit.

Vann McGee has proven that if you have an agent with an unbounded utility function who thinks there are infinitely many possible states of the world (i.e., assigns them probability greater than 0), then you can construct a Dutch book against that agent. Next, observe that anyone who wants to use Solomonoff induction as a guide has committed to infinitely many possible states of the world. So if you also want to admit unbounded utility functions, you have to accept rational agents who will buy a Dutch book.

And if you do that, then the sub­jec­tivist jus­tifi­ca­tion of prob­a­bil­ity the­ory col­lapses, tak­ing Bayesi­anism with it, since that’s based on non-Dutch-book-abil­ity.

I think the clean­est op­tion is to drop un­bounded util­ity func­tions, since they buy you zero ad­di­tional ex­pres­sive power. Sup­pose you have an event space S, a prefer­ence re­la­tion P, and a util­ity func­tion f from events to non­nega­tive real num­bers such that if s1 P s2, then f(s1) < f(s2). Then, you can eas­ily turn this into a bounded util­ity func­tion g(s) = f(s)/​(f(s) + 1). It’s eas­ily seen that g re­spects the prefer­ence re­la­tion P in ex­actly the same way as f did, but is now bounded to the in­ter­val [0, 1).
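The transformation is easy to check numerically. A minimal sketch, with the f-values chosen arbitrarily:

```python
def bound(f_value):
    """g(s) = f(s) / (f(s) + 1): squashes a nonnegative utility into
    [0, 1) while preserving order, since x/(x+1) is strictly increasing."""
    assert f_value >= 0
    return f_value / (f_value + 1)

f_values = [0.0, 0.5, 1.0, 10.0, 1e6, 1e15]   # arbitrary nonnegative utilities
g_values = [bound(f) for f in f_values]

assert all(0 <= g < 1 for g in g_values)       # bounded to [0, 1)
assert g_values == sorted(g_values)            # same ordering as f
assert len(set(g_values)) == len(g_values)     # strictly increasing, no ties
```

Any strictly increasing squashing function onto a bounded interval (arctan, for instance) would do the same job; x/(x+1) is just about the simplest.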

“Odd, I’ve been read­ing moral para­doxes for many years and my brain never crashed once, nor have I turned evil.”

Even if it hasn’t happened to you, it’s quite common: think about how many people under Stalin had their brains programmed to murder and torture. Looking back and seeing how your brain could have crashed is scary, because it isn’t particularly improbable; it almost happened to me, more than once.

That’s a re­mark­able level of re­silience for a brain de­sign which is, speak­ing pro­fes­sion­ally, a damn ugly mess.

...with vi­tal func­tions in­her­ited from rep­tiles. But it’s been tested to death through his­tory, se­ri­ous failures thrown out at each step, and we’ve lots of prac­ti­cal ex­pe­rience and knowl­edge about how and why it fails. It wasn’t built and run first go with zero un­re­cov­er­able er­rors.

I’m not advocating using evolutionary algorithms or modeling from the human brain like Ray Kurzweil. I just mean I’d allow for unexpected breakdowns in any part of the system, however much you trust it. At least enough so that if it fails, it fails safe.

That’s only my opinion, and it shouldn’t be taken too se­ri­ously as I don’t have much knowl­edge in the field at this time, but I thought I should ex­plain what I meant.

Rolf: I agree with everything you just said, especially the bit about patches and hacks. I just wouldn’t be happy having an FAI’s sanity dependent on any single part of its design, no matter how perfect and elegant looking, or provably safe on paper, or demonstrably safe in our experiments.

I generally share Tom McCabe’s conclusion, that is, that they exactly cancel out because a symmetry has not been broken. The reversed hypothesis has the same complexity as the original hypothesis, and the same evidence supporting it. No differential entanglement. However, I think that this problem is worth attention because a) so many people who normally agree disagree here, and b) I suspect that the problem is related to normal utilitarianism with no discounting and an unbounded future. Of course, we already have some solutions in that case, and we should try them and see what we get. Our realistic AGI is boundedly rational in some respect or another. How does its limitation in predicting the mundane consequences of any given action relate to its limitation in predicting the probabilities in Pascal’s Mugging?

“Let the differ­en­tial be nega­tive. Same prob­lem. If the differ­en­tial is not zero, the AI will ex­hibit un­rea­son­able be­hav­ior. If the AI liter­ally thinks in Solomonoff in­duc­tion (as I have de­scribed), it won’t want the differ­en­tial to be zero, it will just com­pute it.”

How can a com­pu­ta­tion ar­rive at a nonzero differ­en­tial, start­ing with zero data? If I ask a ra­tio­nal AI to calcu­late the prob­a­bil­ity of me typ­ing “QWERTYUIOP” sav­ing 3^^^^3 hu­man lives, it knows liter­ally noth­ing about the causal in­ter­ac­tions be­tween me and those lives, be­cause they are to­tally un­ob­serv­able.

Keep in mind that I have very limited knowledge of probability or analytic philosophy, but wouldn’t a very easy answer be that if you can conceive of a scenario with the same outcome assigned to NOT doing the action, and that scenario has an equal probability of being true, then they’re both irrelevant?

If it’s pos­si­ble that you can get an in­finite amount of gain by be­liev­ing in god, it’s equally pos­si­ble you can get an in­finite amount of gain by NOT be­liev­ing in god.

“Give me five dol­lars, or I’ll use my magic pow­ers from out­side the Ma­trix to run a Tur­ing ma­chine that simu­lates and kills 3^^^^3 peo­ple.”

If there’s an ar­bi­trar­ily small prob­a­bil­ity that this state­ment is true, there’s an equal ar­bi­trar­ily small prob­a­bil­ity of the same re­sult if you do noth­ing.

So the prob­a­bil­ity of those deaths re­mains the same re­gard­less of your ac­tions. So the state­ment is mean­ingless.

Ob­vi­ously this doesn’t always ap­ply.

If this over­looks some­thing in­cred­ibly ob­vi­ous then I’m just a stupid teenager and please be kind.

Was Kant an an­a­lytic philoso­pher? I can’t re­mem­ber, but think­ing in terms of your ac­tions as be­ing the stan­dard for a “cat­e­gor­i­cal im­per­a­tive” fol­lowed by your­self in all situ­a­tions as well as by all moral be­ings, the effect of giv­ing the mug­ger the money is more than $5. If you give him the money once he’ll be able to keep on de­mand­ing it from you as well as from other ra­tio­nal­ists. Hence the effect will be not $5 but all of your (plu­ral) money, a harm which might be in a sig­nifi­cant enough ra­tio to the deaths of all those peo­ple to war­rant not giv­ing him the money.

I think we have to assume that, although this sounds awfully like that quote about “a million deaths are a statistic”, the cost of additional deaths decreases. I’m not really sure how to justify that, though.

I would protest that a program to run our known laws of physics (which only predict that 10^80 atoms exist...so there’s no way 3^^^^3 distinct minds could exist) is smaller by some number of bits on the order of log_2(3^^^^3) than one in which I am seemingly running on the known laws of physics, and my choice whether or not to hand over $5 (to someone acting as if they are running on the known laws of physics...seeming generally human, and trying to gain wealth without doing hard work) is positively correlated with whether or not 3^^^^3 minds running on a non-physical server somewhere (yes, this situation requires the postulation of the supernatural) are extinguished. The mugger is claiming to be God in the flesh. This problem is the steel-manned Pascal’s Wager.

And you should not give the mug­ger the money, for the same rea­son you should not be­come a Chris­tian be­cause of Pas­cal’s wa­ger. Most of the (vast,vast) im­prob­a­bil­ity lies in the mug­ger’s claim to be God. The truth of eir claim is about as prob­a­ble as the truth of an idea that popped into a psy­chopath’s head that e needs to set eir wallet on fire this in­stant, in or­der to please the gods, or else they will tor­ture em for 3^^^^3 years. I claim that Eliezer vastly, vastly un­der­es­ti­mated the com­plex­ity of this situ­a­tion...which is already enough to solve the prob­lem. But, sup­pose, just sup­pose, this mug­ger re­ally is God. Then which God is e? For ev­ery pos­si­ble mind that does X in situ­a­tion A, there is a mind that does not-X in situ­a­tion A. We don’t in­ter­act with gods on a day-to-day ba­sis; we in­ter­act with hu­mans. We have no prior in­for­ma­tion about what this mug­ger will do in this situ­a­tion if we give em the money. E could be the “Pro­fes­sor Mug­ger” that pun­ishes you iff you give em the money, be­cause you acted in a way such that any per­son off the street could have come up to you and taken your money. E could just not do any­thing ei­ther way. You don’t know. Ig­no­rance prior, .5. I have no effect in this situ­a­tion; I do $5 bet­ter in the (ex­tremely, ex­tremely, more likely) situ­a­tion where this mug­ger is a con artist. No money for the mug­ger. There’s also the slight is­sue that I’m more likely to be mugged this way if I pre­com­mit to los­ing my money...so I’d be pretty stupid to do that.

The AI will shut up and mul­ti­ply if it’s pro­grammed prop­erly, and get the right an­swer, my an­swer, the an­swer that also hap­pens to be the one we in­stinc­tu­ally lean to­wards here. If we’re run­ning a hu­man-friendly dy­namic, there’s no need to worry about the AI mak­ing the wrong choice. Do we se­ri­ously think we could do bet­ter? If so, then why cre­ate the AI in the first place?

How do you figure? Since we’re not talk­ing about speed, the pro­gram seems to this lay­man like one a su­per-in­tel­li­gence could write while still on Earth (per­haps with difficulty). While the num­ber you just named, and even the range of num­bers if I take it that way, looks larger than the num­ber of atoms on Earth. The whole point is that you can de­scribe “3^^^^3” pretty sim­ply by com­par­i­son to the ac­tual num­ber.

For ev­ery pos­si­ble mind that does X in situ­a­tion A, there is a mind that does not-X in situ­a­tion A.

And some are more likely than oth­ers (in any co­her­ent way of rep­re­sent­ing what we know and don’t know), of­ten by slight amounts that mat­ter when you mul­ti­ply them by 3^^^3. Any­way, the prob­lem we face is not the origi­nal Mug­ger. The prob­lem is that ex­pected value for any given de­ci­sion may not con­verge if we have to think this way!

In ret­ro­spect, I think Eliezer should not have fo­cused on that as much as he did. Let’s cut to the core of the is­sue: How should an AI han­dle the prob­lem of mak­ing choices, which, maybe, just maybe, could have a huge, huge effect?

I think Eliezer overlooked the complexity inherent in a mind...the complexity of the situation isn’t in the number; it’s in what the things being numbered are. To create 3^^^^3 distinct, complex things that would be valued by a posthuman would be an incredibly difficult, time-consuming task. Of course, at this moment, the AI doesn’t care about doing that; it cares whether or not the universe is already running 3^^^^3 of these things. I do think a program to run these computations might be more complex than a program to simulate our physics, but stepping back, it would not have to be anywhere near log_2(3^^^^3) bits more complex. Really, really bad case of scope insensitivity on my part.

For ev­ery pos­si­ble mind that does X in situ­a­tion A, there is a mind that does not-X in situ­a­tion A.

My first com­ment was wrong. That ar­gu­ment should have been the pri­mary ar­gu­ment, and the other shouldn’t have been in there, at all...but let’s step back from Eliezer’s ex­act given situ­a­tion. This is a gen­eral prob­lem which ap­plies to, as far as I can see, pretty much any ac­tion an AI could take (see Tom_McCabe2′s “QWERTYUIOP” re­mark).

Let’s say the AI wants to save a drowning child. However, the universe happens to care about this single moment in time, and iff the AI saves the child, 3^^^^3 people will die instantly, and then the AI will be given information to verify that this has occurred with high probability. One of the simplest ways for the universe-program to implement this is:

If (AI saves child), then re­set all bits in that con­stantly evolv­ing 3^^^^3-en­try long data struc­ture over there to zero, send proof to AI.
Else, pro­ceed nor­mally.

Note that this is magic. Magic is that which can­not be un­der­stood, that which cor­re­lates with no data other than it­self. The code could just as eas­ily be this:

If (AI saves child), then pro­ceed nor­mally.
Else, re­set all bits in that con­stantly evolv­ing 3^^^^3-en­try long data struc­ture over there to zero, send proof to AI.

Those two code seg­ments are equally com­pli­cated. The AI shouldn’t weight ei­ther higher than the other. For each small in­cre­ment in com­plex­ity to the “malev­olent” code you make from there, to have it carry out the same func­tion, I con­tend that you can make a cor­re­spond­ing in­cre­ment in the “benev­olent” code to do the same thing.

If our uni­verse was op­ti­mized to give us hope, and then thwart our val­ues, there’s noth­ing even an AI can do about that. An AI can only op­ti­mize that which it both un­der­stands, and is per­mit­ted to op­ti­mize by the uni­verse’s code. The uni­verse’s code could be such that it gives the AI false be­liefs about pretty much ev­ery­thing, and then the AI would be un­able to op­ti­mize any­thing.

If the “malevolent” code runs, then the AI would make a HUGE update after that, possibly choosing not to save any drowning children anymore (though that update would be wrong if the code were as above...overfitting). But it can’t update on the possibility that it might update; that would violate conservation of expected evidence. All disease might magically be cured immediately if the AI saves the drowning child. I don’t see how this is any more complex.

some are more likely than others

So, this is what I con­test. If one was re­ally that much more likely, the AI would have already known about it (cf. what Eliezer says in “Tech­ni­cal Ex­pla­na­tion”: “How would I ex­plain the event of my left arm be­ing re­placed by a blue ten­ta­cle? The an­swer is that I wouldn’t. It isn’t go­ing to hap­pen....If I was wor­ried I might some­day need a clever ex­cuse for wak­ing up with a ten­ta­cle, the rea­son I was ner­vous about the pos­si­bil­ity would be my ex­pla­na­tion.”). An AI is de­signed to ac­com­plish this task as best as is pos­si­ble. I no­ticed my con­fu­sion when I re­called this pa­per refer­ring to AIXI I’d pre­vi­ously taken a short look at. The AI won on Par­tially Ob­serv­able Pac­man; it did much bet­ter than I could ever hope to do (if I were given the data in the form of pure nu­mer­i­cal re­ward sig­nals, writ­ten down on pa­per). It didn’t get stuck won­der­ing whether it would lose 2,000,000 points when the most it had ever lost be­fore was less than 100.

I know al­most noth­ing about AI. I don’t know the right way we should ap­prox­i­mate AIXI, and mod­ify it so that it knows it is a part of its en­vi­ron­ment. I do know enough about ra­tio­nal­ity from read­ing Less Wrong to know that we shouldn’t shut it off just be­cause it does some­thing coun­ter­in­tu­itive, if we did pro­gram it right. (And I hope to one day make both of the first two sen­tences in this para­graph false.)

Also note that, between Tegmark’s multiverses and the Many-Worlds Hypothesis, there are many, many more than 10^80 atoms in existence. 10^80 is only the number of atoms we could ever see, assuming FTL is impossible.

A thought on this, and apolo­gies if it re­peats some­thing already said here. Ba­si­cally: ques­tion the struc­ture that leads to some­one say­ing this to you, and ques­tion how easy it is to talk about 3^^^^3 peo­ple as op­posed to, say, 100. If sud­denly said per­son man­i­fests Magic Ma­trix God Pow­ers (R) then the ev­i­dence gained by ob­serv­ing this or any­thing that con­tains it (they’re tel­ling the truth about all this, you have gone in­sane, aliens/​God/​Cthulhu is/​are caus­ing you to see this, this per­son re­ally did just paint mile-high let­ters in the sky and there is no Ma­trix) should be more than enough to tip the bal­ance of ev­i­dence in fa­vor of “yeah, holy crap, this per­son de­serves my $5!” In short—don’t even take se­ri­ously your model of an agent as be­ing truth­ful or un­truth­ful; it could be out­side-con­text-prob­lem wrong, es­pe­cially if it be­haves patholog­i­cally, be­ing overly sen­si­tive to small amounts of ev­i­dence for bad rea­sons. Similar idea to your whole model pos­si­bly just be­ing wrong if it re­ports 1-1/​3^^^3 cer­tainty.

Eliezer,
The rational answer to Pascal’s mugging is to refuse, attempt to persuade the mugger, and when that fails (which I postulate based on an ethical entity able to comprehend 3^^^3 and an unethical entity willing to torture that many) to initiate conflict.

The calcu­la­tional alge­bra of loss over prob­a­bil­ity has to be tem­pered by fu­ture pre­dic­tion:

What is the chance the mug­ger will do this again? If my only op­tions are to give 5 or not give 5 does it mean 3^^^^^3 will end up be­ing at risk as the mug­ger keeps do­ing this? How do I make it stop?

The responsible long-term answer is: let the hostages die if needed to kill the terrorist, because otherwise you get more terrorists taking hostages.

More for­mally, in a Bayesian set­ting, Sa­gan’s maxim can be con­strued as the re­quire­ment for the prior to be a non-heavy-tailed prob­a­bil­ity dis­tri­bu­tion.

In fact, in for­mal ap­pli­ca­tions of Bayesian meth­ods, typ­i­cal light-tailed max­i­mum en­tropy dis­tri­bu­tions such as nor­mal or ex­po­nen­tial are used.

Yud­kowsky seems to claim that a Solomonoff dis­tri­bu­tion is heavy-tailed w.r.t. the rele­vant vari­ables, but he doesn’t provide a proof of that claim, and in­deed the claim is even difficult to for­mal­ize prop­erly, since the Solomonoff in­duc­tion model has no ex­plicit no­tion of world state vari­ables, it just defines a prob­a­bil­ity dis­tri­bu­tion over ob­ser­va­tions.

Any­way, that’s an in­ter­est­ing ques­tion, and, if it turns out that the Solomonoff prior is in­deed heavy-tailed w.r.t. any rele­vant state vari­able, it would seem to me as a good rea­son not to use Solomonoff in­duc­tion.
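To illustrate the convergence issue in miniature (with made-up stand-in weight functions, not the actual Solomonoff prior): if outcomes of size 2^k receive prior weight p(k), a light-tailed p keeps the expected-size series bounded, while a heavy-tailed p lets it grow without limit:

```python
# Contribution of outcome k to an expected value: U_k * p(k), with U_k = 2^k.
# Under a light tail (weight shrinking faster than U_k grows) the partial
# sums converge; under a heavy tail they keep growing.

def partial_sum(p, terms):
    return sum((2 ** k) * p(k) for k in range(1, terms))

light = lambda k: 2.0 ** (-2 * k)  # stand-in light tail: weight 4^-k
heavy = lambda k: 1.0 / k ** 3     # stand-in heavy tail: only polynomial in k

assert partial_sum(light, 60) - partial_sum(light, 30) < 1e-8  # converged
assert partial_sum(heavy, 60) > 1000 * partial_sum(heavy, 30)  # still growing
```

This is exactly the distinction the comment is asking about: whether the weights Solomonoff induction assigns to huge-payoff hypotheses behave like `light` or like `heavy` with respect to some relevant variable.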

IIUC, Yud­kowsky’s episte­mol­ogy is es­sen­tially that Solomonoff in­duc­tion is the ideal of un­bounded epistemic ra­tio­nal­ity that any bound­edly ra­tio­nal rea­soner should try to ap­prox­i­mate.

I contest that Solomonoff induction is the self-evident ideal of epistemic rationality.

What is the mere Earth at stake, com­pared to a tiny prob­a­bil­ity of 3^^^^3 lives?

Do you re­ally think this would be clearer or more rigor­ous if writ­ten in math­e­mat­i­cal no­ta­tion?

Any­way, that’s an in­ter­est­ing ques­tion, and, if it turns out that the Solomonoff prior is in­deed heavy-tailed w.r.t. any rele­vant state vari­able, it would seem to me as a good rea­son not to use Solomonoff in­duc­tion.

What is the mere Earth at stake, com­pared to a tiny prob­a­bil­ity of 3^^^^3 lives?
Do you re­ally think this would be clearer or more rigor­ous if writ­ten in math­e­mat­i­cal no­ta­tion?

The prob­lem is that Solomonoff in­duc­tion is an es­sen­tially opaque model.

Think of it as a black box: you put in a string of bits representing your past observations and it gives you a probability distribution over strings of bits representing your possible future observations. If you open the lid of the box, you will see many (ideally infinitely many) computer programs with arbitrary structure. There is no easy way to map that model to a probability distribution on non-directly-observable world state variables such as “the number of people alive”.
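A drastically simplified toy of that black box, with four hand-written “programs” standing in for the infinite enumeration (the description lengths are invented): note that the output is only a distribution over the next observed bit, and no world-state variable appears anywhere:

```python
# Toy Solomonoff-style mixture. Each "program" is a generator for an
# observation stream; it gets prior weight 2^-length and is kept only if
# it reproduces the observations seen so far. The output is a distribution
# over the NEXT OBSERVED BIT -- nothing here mentions a world state like
# "the number of people alive".

programs = {  # name: (invented description length, bit generator)
    "all_zeros":   (3, lambda i: 0),
    "all_ones":    (3, lambda i: 1),
    "alternate":   (5, lambda i: i % 2),
    "ones_after4": (8, lambda i: int(i >= 4)),
}

def predict_next(obs):
    weights = {0: 0.0, 1: 0.0}
    for length, gen in programs.values():
        if all(gen(i) == b for i, b in enumerate(obs)):  # consistent so far
            weights[gen(len(obs))] += 2.0 ** -length
    z = weights[0] + weights[1]
    return {b: w / z for b, w in weights.items()}

dist = predict_next([0, 1, 0, 1])  # only "alternate" survives the data
assert dist[0] == 1.0
```

Extracting “how many people does this hypothesis say exist?” would require interpreting the internals of each surviving program, which is the opacity being described.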

Isn’t that kinda the point?

My in­ter­pre­ta­tion is that Yud­kowsky as­sumes Solomonoff in­duc­tion es­sen­tially a pri­ori and thus is puz­zled by the dilemma it allegedly yields. My point is that:

It’s not obvious that Solomonoff induction actually yields that dilemma.

I’ve been ar­gu­ing about this with a friend re­cently [well, a ver­sion of this—I don’t have any prob­lems with ar­bi­trar­ily large num­ber of peo­ple be­ing cre­ated and kil­led, un­less the man­ner of their death is un­pleas­ant enough that the nega­tive value I as­sign to it ex­ceeds the pos­i­tive value of life].

He says that he can believe the person we are talking to has Agent Smith powers, but thinks that the more the Agent Smith promises, the less likely it is to be true, and this decreases faster the more that is promised, so that the probability that Agent Smith has the powers to create and kill [in an unpleasant manner] Y people, multiplied by Y, tends to zero as Y tends to infinity. So the net expectancy tends towards zero. I disagree with this: I believe that if you assign probability X to the claim that the person you are talking to is genuinely from outside the Matrix [and that you’re in the Matrix], then the probability that Agent Smith has the powers to create and kill [in an unpleasant manner] Y people, multiplied by Y, tends to infinity as Y tends to infinity.

Now, I think we can break this down fur­ther to find the root cause of our dis­agree­ment [this doesn’t feel like a fun­da­men­tal be­lief]: does any­one have any sug­ges­tions for how to go about do­ing this? We be­gan to ar­gue about en­tropy and the chance for Agent Smith to have found a way [from out­side the Ma­trix = all our physics doesn’t ap­ply to him] to re­verse it, but I think we went down­hill from there.

Edit: Looks like I was assuming probability distributions for which lim (Y → infinity) of Y*P(Y) is well defined. This turns out to hold only for monotonic series or some similar class (thanks shinoteki).

I think it’s still the case that a prob­a­bil­ity dis­tri­bu­tion that would lead to TraderJoe’s claim of P(Y)*Y tend­ing to in­finity as Y grows would be un-nor­mal­iz­able. You can of course have a dis­tri­bu­tion for which this limit is un­defined, but that’s a differ­ent story.

Coun­terex­am­ple:
P(3^^^…3 with n “^”s) = 1/2^n
P(any­thing else) = 0
This is nor­mal­ized be­cause the sum of a ge­o­met­ric se­ries with de­creas­ing terms is finite.
You might have been thinking of the fact that if a probability distribution on the integers is monotone decreasing (i.e. if P(n) > P(m) then n < m) then P(n) must decrease faster than 1/n. However, a complexity-based distribution will not be monotone, because some big numbers are simple while most of them are complex.
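A quick check of shinoteki’s construction, with the uncomputably large n-arrow tower replaced by the stand-in Y_n = 2^(2^n) (same structure: the outcome grows far faster than 2^n shrinks, so Y·P(Y) still blows up even though the probabilities sum to 1):

```python
from fractions import Fraction

# shinoteki's counterexample, with 2^(2^n) standing in for the
# n-arrow tower 3^^...^3: assign probability 1/2^n to the n-th outcome.

P = lambda n: Fraction(1, 2 ** n)   # geometric: normalizable
Y = lambda n: 2 ** (2 ** n)         # grows far faster than 1/P(n)

N = 20
assert sum(P(n) for n in range(1, N + 1)) == 1 - Fraction(1, 2 ** N)

# Yet Y_n * P_n = 2^(2^n - n) increases without bound, so the
# "expected number of lives" series diverges.
terms = [Y(n) * P(n) for n in range(1, 8)]
assert all(b > a for a, b in zip(terms, terms[1:]))
```

Exact `Fraction` arithmetic is used so the geometric-series identity holds with no rounding; the divergence is checked on the first few terms only, since the stand-in already exceeds 2^128 by n = 7.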

The prob­lem seems to van­ish if you don’t ask “What is the ex­pec­ta­tion value of util­ity for this de­ci­sion, if I do X”, but rather “If I changed my men­tal al­gorithms so that they do X in situ­a­tions like this all the time, what util­ity would I plau­si­bly ac­cu­mu­late over the course of my en­tire life?” (“How much util­ity do I get at the 50th per­centile of the util­ity prob­a­bil­ity dis­tri­bu­tion?”) This would have the fol­low­ing re­sults:

For the limit case of de­ci­sions where all pos­si­ble out­comes hap­pen in­finitely of­ten dur­ing your life­time, you would de­cide ex­actly as if you wanted to max­i­mize ex­pec­ta­tion value in an in­di­vi­d­ual case.

You would not decide to give money to Pascal’s mugger if you don’t expect that there are many fundamentally different scenarios which a mugger could tell you about: if you give a 5% chance to the scenario described by Pascal’s mugger and believe that this is the only scenario which, if true, would make you give $5 to some person, you would not give the money away.

In contrast, if you believe that there are 50 different mugging scenarios which people will tell you during your life to pascal-mug you, and you assign an independent 5% chance to each of them, you would give money to a mugger (and expect this to pay off occasionally).
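A rough Monte Carlo sketch of that comparison. The 5% chance and $5 cost come from the comment; the size of the averted loss and the decide-by-median rule are stand-in assumptions:

```python
import random
import statistics

random.seed(0)
P_TRUE, COST, LOSS = 0.05, 5, 10_000   # LOSS: arbitrary stand-in stake

def median_outcome(n_scenarios, pay, trials=10_000):
    """Median total utility of always paying vs. never paying,
    when each of n_scenarios threats is independently real with P_TRUE."""
    outcomes = []
    for _ in range(trials):
        real = sum(random.random() < P_TRUE for _ in range(n_scenarios))
        outcomes.append(-COST * n_scenarios if pay else -LOSS * real)
    return statistics.median(outcomes)

# One scenario: at the median, refusing loses nothing, paying loses $5.
assert median_outcome(1, pay=False) == 0
assert median_outcome(1, pay=True) == -5

# Fifty scenarios: the median refuser now eats a few real losses,
# while the payer is out a predictable $250.
assert median_outcome(50, pay=False) <= -LOSS
assert median_outcome(50, pay=True) == -250
```

This matches the comment’s claim: the median-outcome rule agrees with expected value in the many-repetitions limit but refuses the one-shot mugger.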

As I see it, the mug­ger seems to have an ex­tremely bad hand to play.

If you evaluate the probability of the statement ‘I will kill one person if you don’t give me five dollars’ as something that stands in a relationship to the occurrence of such threats being carried through on, and simply multiply up from there until you get to 3^^^^3 people, then you’re going to end up with problems.

However, that sort of simplification, treating all the evidence as locating the same thing, only works for low multiples. (Which I’d imagine is why it feels wrong when you start talking about large numbers.) If you evaluate the evidence for and against different parts of the statement, then you can’t simply scale it up as a whole without scaling up all the variables that the evidence attaches to. The probability that the person will carry through on a threat to kill 3^^^3 people for five dollars is going to zero out fairly quickly. You need to scale up the dollars asked for to get all the bits of evidence to scale in proportion to each other.

To make the threat plau­si­ble the mug­ger would have to be ask­ing for a ridicu­lously large benefit for them­selves. And when you start ask­ing for that huge benefit then the com­puter sim­ply has to an­swer whether the re­sources can be put to bet­ter use el­se­where.

As it stands, how­ever, the vari­ables in the state­ment haven’t been prop­erly scaled to keep the ev­i­dence for and against the pro­posed mur­ders in a con­stant re­la­tion­ship. And, while it’s just about pos­si­ble that some­one will kill a few hun­dred peo­ple for five dol­lars (de­stroy­ing a train or the like would be a low in­vest­ment ex­er­cise) the prob­a­bil­ity rapidly ap­proaches zero as you in­crease the num­ber of peo­ple that you’re propos­ing to kill for five dol­lars.

By the time you’re talk­ing about 3^^^^3 lives the prob­a­bil­ity would have long since been re­duced to an ab­sur­dity. Which would then be com­pared against all the other things the FAI could do with five dol­lars, that have far higher prob­a­bil­ities, and sim­ply be dis­missed as a bad gam­ble. (Since over a great length of time the com­puter could rea­son­ably ex­pect to ap­proach the pre­dicted loss/​benefit ra­tio, re­gard­less of whether the mug­ger ac­tu­ally kil­led 3^^^^3 peo­ple.)

This com­ment thread has grown too large :). I have a thought that seems to me to be the right way to re­solve this prob­lem. On the one hand, the thought is ob­vi­ous, so it prob­a­bly has already been played out in this com­ment thread, where it pre­sum­ably failed to con­vince ev­ery­one. On the other hand, the thread is too large for me to di­gest in the time that I can rea­son­ably give it. So I’m hop­ing that some­one more fa­mil­iar with the con­ver­sa­tion here will tell me where I can find the sub-thread that ad­dresses my point. (I tried some ob­vi­ous word-searches, and noth­ing came up.)

Any­way, here is my point. I can see that the hy­poth­e­sis that 3^^^^3 peo­ple are be­ing tor­tured might be sim­ple enough so that the Solomonoff prior is high enough so that the AI would give in to the mug­ger, if the AI were us­ing an un-up­dated Solomonoff prior. But the AI is al­lowed to up­date, right? And, from what the AI knows about hu­mans, it can see that the low com­plex­ity of 3^^^^3 also makes it more prob­a­ble that a “philoso­pher out for a fast buck” would choose that num­ber.

So, the sim­plic­ity of 3^^^^3 con­tributes to both the hy­poth­e­sis of a real tor­turer and the hy­poth­e­sis of the liar.

And if, af­ter tak­ing all this into ac­count, the AI still com­putes a high ex­pected util­ity for giv­ing in to the mug­ger, well, then I guess that that is re­ally what it ought to do (as­sum­ing that it shares my util­ity func­tion). But is there any rea­son to think that this is likely? Does it fol­low just from Eliezer’s ob­ser­va­tion that “the util­ity of a Tur­ing ma­chine can grow much faster than its prior prob­a­bil­ity shrinks”? After all, it’s the up­dated prob­a­bil­ity that re­ally mat­ters, isn’t it?
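A toy version of that update, with all the numbers invented for illustration: because both the real-torturer hypothesis and the liar-out-for-a-buck hypothesis predict hearing “3^^^^3” about equally well, the utterance barely shifts the odds between them:

```python
# Made-up priors and likelihoods for three hypotheses about the mugger.
# The point is structural: simplicity of 3^^^^3 raises the likelihood of
# the claim under BOTH "torturer" and "liar", so the posterior odds
# between them stay where the priors put them.

prior = {"torturer": 1e-30, "liar": 1e-3, "other": 1 - 1e-3 - 1e-30}
likelihood = {"torturer": 0.9, "liar": 0.9, "other": 1e-9}

joint = {h: prior[h] * likelihood[h] for h in prior}
z = sum(joint.values())
posterior = {h: j / z for h, j in joint.items()}

ratio_before = prior["liar"] / prior["torturer"]
ratio_after = posterior["liar"] / posterior["torturer"]
assert abs(ratio_after - ratio_before) / ratio_before < 1e-9  # odds unmoved
assert posterior["liar"] > 0.99   # the claim is evidence of a liar overall
```

So the comment’s question reduces to whether the *prior* odds already favor the liar by more than 3^^^^3 worth of utility, which is what the updated probability has to settle.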

The prob­a­bil­ity of some ac­tion cost­ing delta-util­ity x and re­sult­ing in delta-util­ity y, where y >> x, is low. The Anti Gratis Din­ing mod­ifier is x/​y. Th­ese things I con­jec­ture, any­ways.

The ap­ple-sale­speep who says, “Give me $0.50, and I will give you an ap­ple” is quite be­liev­able, un­like the ap­ple-sale­speep who claims, “Give me $3.50, and I will give ap­ples to all who walk the Earth”. We un­der­stand how buy­ing an ap­ple gets us an ap­ple, but we know far less about im­ple­ment­ing global ap­ple dis­tri­bu­tion.

Sup­pose I have a Holy Hand Gre­nade of FAI, which has been care­fully proofed by all the best math­e­mat­i­ci­ans, pro­gram­mers, and philoso­phers, and I am (of course) amongst them. And am ran­domly se­lected to ac­ti­vate it! Sadly, there is an ant caught in the pin. I can not de­lay to ex­tri­cate it, for that means more deaths left un­pre­vented. I pull the pin and kill the ant any­ways.

So, the more un­der­stand­ing you have about the situ­a­tion at hand, the less the AGD fac­tor ap­plies to the situ­a­tion.

Assume that the basic reasoning for this is true, but nobody actually does the mugging. Since the probability doesn’t actually make a significant difference to the expected utility, I’ll just simplify and say they’re equal.

The to­tal ex­pected marginal util­ity, as­sum­ing you’re equally likely to save or kill the peo­ple, would be (3^^^3 − 3^^^3) + (3^^^^3 − 3^^^^3) + (3^^^^^3 − 3^^^^^3) + … = 0. At least, it would be if you count it by al­ter­nat­ing with sav­ing and kil­ling. You could also count it as 3^^^3 + 3^^^^3 − 3^^^3 + 3^^^^^3 − 3^^^^3 + … = in­finity. Or you could count it as −3^^^3 − 3^^^^3 + 3^^^3 − 3^^^^^3 + 3^^^^3 - … = -in­finity. Or you could even do 3^^^3 − 3^^^3 + 3^^^^3 − 3^^^^3 + 3^^^^^3 − 3^^^^^3 + … (with­out paren­the­ses) which doesn’t even con­verge to any­thing.

You could also con­struct hy­po­thet­i­cal pos­si­bil­ity sets where you can set it to add to any given num­ber by re­ar­rang­ing the pos­si­bil­ities.

It’s one thing when or­der mat­ters for talk­ing about to­tal util­ity of an in­finitely long uni­verse. It at least has an or­der, as­sum­ing you don’t mind aban­don­ing spe­cial rel­a­tivity, but what or­der are you even sup­posed to count ex­pected util­ity in?
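The order dependence can be exhibited with small stand-in magnitudes (powers of ten in place of 3^^^3, 3^^^^3, …). Each list below is a finite prefix of an ordering of the same collection of paired +a/−a terms, yet the running totals head in completely different directions:

```python
# Stand-in magnitudes a_1..a_6 = 10, 100, ..., 10^6 for 3^^^3, 3^^^^3, ...

def running_total(xs):
    total, sums = 0, []
    for x in xs:
        total += x
        sums.append(total)
    return sums

mags = [10 ** k for k in range(1, 7)]

# Ordering 1: each +a_n immediately cancelled by -a_n.
paired = [x for m in mags for x in (m, -m)]

# Ordering 2: positives run one step ahead of negatives, so the largest
# negative term is always still pending (as in the rearranged infinite sum).
led_by_positives = [mags[0]] + [x for a, b in zip(mags, mags[1:])
                                for x in (b, -a)]

assert running_total(paired)[-1] == 0                   # cancels to zero
assert running_total(led_by_positives)[-1] == mags[-1]  # climbs toward +inf
assert running_total([-x for x in led_by_positives])[-1] == -mags[-1]
```

With genuinely infinite term lists these prefixes never stop, which is the comment’s point: without a canonical summation order, “the” expected utility is not defined.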

I figure the only way out of this is to use a prior that de­creases with ex­pected util­ity faster than those for­mu­la­tions of Oc­cam’s ra­zor would sug­gest. I don’t like the idea of do­ing this, but not do­ing so just doesn’t add up.

Our best un­der­stand­ing of the na­ture of the “simu­la­tion” we call re­al­ity has this con­cept we call “cause and effect” in place. So when some­thing hap­pens it has non-zero (though nigh in­finitely small) effects on ev­ery­thing else in ex­is­tence (pro­gres­sively smaller effect with each de­gree of sep­a­ra­tion).

The effect that af­fect­ing 3^^^3 things (re­gard­less of type or clas­sifi­ca­tion) has on other things (even if the in­di­vi­d­ual effects of af­fect­ing one thing would be ex­tremely small) would be non-triv­ial (enor­mously large even af­ter a pos­i­tively lu­dicrous de­gree of sep­a­ra­tions).

Once you consider the level of effect that this would have on the whole “simulation” you are forced to consider basically all possible futures. You have nigh-infinite good (when these things are removed/affected you end up with utopia and a range of all possible net benefits for the whole of the simulation) and nigh-infinite penalty (when these things are removed/affected you end up with hell and a range of all possible net losses for the whole of the simulation). I cannot foresee how an AI can possibly have enough processing power to overcome the vagary of being unable to predict all possible futures following the event.

More­over, I per­son­ally balk at the as­sump­tion of that level of re­spon­si­bil­ity. It is for the same rea­son that I balk at time travel sce­nar­ios. I re­fuse to be re­spon­si­ble for what­ever changes are wrought across all of re­al­ity (which in sum be­come quite large when you con­sider a Vast pos­si­bly in­finite uni­verse re­gard­less of how “small” the ini­tial event seems).

Also does the prob­a­bil­ity as­sign­ment take into ac­count the like­li­hood of the ac­tor in ques­tion ap­proach­ing you? As­sum­ing there are 3^^^3 peo­ple (minds), then surely the prob­a­bil­ity as­sign­ment of ap­proach­ing you speci­fi­cally must be ad­justed ac­cord­ingly. I un­der­stand that “some­body has to be ap­proached,” but surely no one here is will­ing to con­tend that any of us have traits which are so ex­cep­tional that they can­not be found in­side of a pop­u­la­tion which is 3^^^3 in size?

Should I think the uni­verse is prob­a­bly a coarse-grained simu­la­tion of my mind rather than real quan­tum physics, be­cause a coarse-grained hu­man mind is fifty(?) or­ders of mag­ni­tude cheaper than real quan­tum physics? Should I think the galax­ies are tiny lights on a painted back­drop, be­cause that Tur­ing ma­chine would re­quire less space to com­pute?

I think a large uni­verse full of ran­domly scat­tered mat­ter is much more prob­a­ble than a small uni­verse that con­sists of a work­ing hu­man mind and lit­tle else.

Before I get going, please let me make clear that I do not understand the math here (even Eliezer’s intuitive bayesian paper defeated me on the first pass, and I haven’t yet had the courage to take a second pass), so if I’m Missing The Point(tm), please tell me.

It seems to me that what’s missing is talk about the probability of a given level of resourcefulness of the mugger. Let me ’splain.

If I ask the mugger for more detail, there are a wide variety of different variables that determine how resourceful the mugger claims to be. The mugger could, upon further questioning, reveal that all the death events are the same entity being killed in the same way, which I would call one death; given the unlikelihood of the mugger telling the truth in the first place, I’d not pay. Similarly, the mugger could reveal that the deaths, while of distinct entities, happen one at a time, and may even include time for the entities to grow up and become functioning adults (i.e. one death every 18 years), in which case I can almost certainly put the money to better use by giving it to SIAI.

On the other end of the scale, the mugger can claim infinite resources, so that they can complete the deaths (of entirely distinct entities, which have lives, grow up, and then are slaughtered) in an infinitely small amount of time. If the mugger does so, they don’t get the money, because I assign an infinitely small value to the probability of the mugger having infinite resources. Yes, the mugger may live in a magical universe where having infinite resources is easy, but you don’t get a get-out-of-probability-assignment-free card just because you say the word “magic”; I still have to base my probability assignment of your claims on the world around me, in which we don’t yet have the computing power to simulate even one human in real time (ignoring the software problem entirely).

Between these two extremes is an entire range of possibilities. The important part here is that the probability I assign to “the mugger is lying” is going to increase exponentially as their claim of resources increases. Until the claimed rate of birth, growing, and dying exceeds the rate of deaths we already have here on Earth, I don’t care, because I can better spend the money here. After we reach that point (~150K per day), I don’t care, because my probability is something like 1/O(2^n) (Computer Science big-O there; sorry, that’s my background), where n is the multiple of computer resources claimed over “one mind in realtime”. At 150K deaths per day, that’s roughly 54,750,000 deaths per year; with 18 years of simulated life for each person, n works out to roughly 985,500,000 concurrently simulated minds. That’s not even counting the probability discount due to the ridiculousness of the whole claim.
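As a sanity check on that paragraph’s arithmetic, here is a quick sketch (using 365 days per year; all inputs are the commenter’s assumptions, not established figures):

```python
# Sketch of the resourcefulness estimate above. All inputs are the
# commenter's assumptions (150K deaths/day, 18-year simulated lives),
# not established figures.
deaths_per_day = 150_000
deaths_per_year = deaths_per_day * 365   # 54,750,000

# To sustain that death rate while each victim lives 18 simulated
# years, the mugger must run this many minds concurrently:
years_per_life = 18
n = deaths_per_year * years_per_life     # 985,500,000

print(deaths_per_year)  # 54750000
print(n)                # 985500000
```

The exact figure doesn’t matter much for the argument: at anything near this scale of n, 1/2^n is astronomically small.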

The point here is that I don’t care about the 3^^^^3 num­ber; I only
care about the claimed deaths per unit time, how that com­pares to
the num­ber of peo­ple cur­rently dy­ing on Earth (on whom I know I
can well-spend the $5) and the claimed re­source­ful­ness of the
mug­ger. By the time we get up to where the 3^^^^3 num­ber mat­ters,
i.e. “I can kill one-one­mil­lionth of 3^^^^3 peo­ple ev­ery re­al­time
year”, my prob­a­bil­ity as­sign­ment for their claimed re­source­ful­ness
is so in­cred­ibly low (and so in­cred­ibly lower than the num­bers they
are throw­ing at me) that I laugh and walk away.

There is not, as far as I can tell, a sweet spot where the number of lives I might save by giving the mugger the $5 exceeds the number of people currently dying on Earth by enough to offset the ridiculously low probability I’d be assigning to the mugger’s resourcefulness. I’d rather give the $5 to SIAI.

Nick Tar­leton,
Yes, it is prob­a­bly cor­rect that one should de­vote sub­stan­tial re­sources to low prob­a­bil­ity events, but what are the odds that the uni­verse is not only a simu­la­tion, but that the con­tain­ing world is much big­ger; and, if so, does the uni­verse just not count, be­cause it’s so small? The bounded util­ity func­tion prob­a­bly reaches the op­po­site con­clu­sion that only this uni­verse counts, and maybe we should keep our am­bi­tions limited, out of fear of at­tract­ing at­ten­tion.

Even if there is no­body cur­rently mak­ing a bignum-level threat, maybe the util­ity-max­i­miz­ing thing to do is to de­vote sub­stan­tial re­sources to search for low-prob­a­bil­ity, high-im­pact events and stop or en­courage them de­pend­ing on the util­ity effect. After all, you can’t say the prob­a­bil­ity of ev­ery pos­si­bil­ity as bad as kil­ling 3^^^^3 peo­ple is zero.

Eliezer, I think Robin’s guess about man­gled wor­lds is in­ter­est­ing, but ir­rele­vant to this prob­lem. I’d guess that for you, P(man­gled wor­lds is cor­rect) is much smaller than P(it’s right that I care about peo­ple in pro­por­tion to the weight of the branches they are in). So Robin’s idea can’t ex­plain why you think that is the right thing to do.

Nick, your pa­per doesn’t seem to men­tion the pos­si­bil­ity of dis­count­ing peo­ple by their al­gorith­mic com­plex­ity. Is that an op­tion you con­sid­ered?

Stephen, no prob­lem. In­ci­den­tally, I share your doubt about the op­ti­mal­ity of op­ti­miz­ing ex­pected util­ity (though I won­der whether there might be a the­o­rem that says any­thing co­her­ent can be squeezed into that form).

CC, in­deed there are many in­fini­ties (not merely in­finitely many, not merely more than we can imag­ine, but more than we can de­scribe), but so what? Any sort of in­finite util­ity, cou­pled with a nonzero finite prob­a­bil­ity, leads to the sort of difficulty be­ing con­tem­plated here. Higher in­fini­ties nei­ther help with this nor make it worse, so far as I can see. (I sup­pose it’s worth con­sid­er­ing that it might con­ceiv­ably make sense for an agent’s util­ities to live in some struc­ture “richer” than the usual real num­bers, like Con­way’s sur­real num­bers, where there are in­fini­ties and in­finites­i­mals aplenty. But I think there are tech­ni­cal difficul­ties with this sort of scheme; for in­stance, do­ing calcu­lus over the sur­re­als is prob­le­matic. And of course we ac­tu­ally only have finite brains, so what­ever util­ities we have are pre­sum­ably rep­re­sentable in finite terms even if they fea­ture in­com­men­su­ra­bil­ities of the sort that might be mod­el­led in terms of some­thing like the sur­real num­bers. But all this is a sep­a­rate is­sue.)

Stephen, you can’t have been agreeing with me about that, since I didn’t say it; though for some reason I don’t understand (perhaps I was very unclear, but I don’t see how), Eliezer chose to interpret me as saying so, and indeed went further to say that it isn’t ethically distinct.

Eliezer, creating another person in addition to 100 super happy people does not reduce the measures of those 100 super happy people. For example, suppose those 100 super happy people are living in a classical universe computed by some TM. The minimal information needed to locate each person in this universe is just his time/space coordinates. Creating another person does not cause an increase in that information for the existing people.

I was es­sen­tially agree­ing with you that kil­ling 3^^^^^3 vs 3^^^^3 pup­pies may not be eth­i­cally dis­tinct. I would call this scope in­sen­si­tivity. My sug­ges­tion was that scope in­sen­si­tivity is not nec­es­sar­ily always un­jus­tified.

Tiiba, keep in mind that to an altruist with a bounded utility function, or with any other of Peter’s caveats, it may not “make perfect sense” to hand over the five dollars. So the problem is solvable in a number of ways; the problem is to come up with a solution that (1) isn’t a hack and (2) doesn’t create more problems than it solves.

Any­way, like most peo­ple, I’m not a com­plete util­i­tar­ian al­tru­ist, even at a philo­soph­i­cal level. Ex­am­ple: if an AI com­plained that you take up too much space and are mopey, and offered to kill you and re­place you with two happy midgets, I would feel no guilt about re­fus­ing the offer, even if the AI could guaran­tee that over­all util­ity would be higher af­ter the swap.

IIRC, Peter de Blanc told me that any con­sis­tent util­ity func­tion must have an up­per bound (mean­ing that we must dis­count lives like Steve sug­gests). The prob­lem dis­ap­pears if your up­per bound is low enough. Hope­fully any re­al­is­tic util­ity func­tion has such a low up­per bound, but it’d still be a good idea to solve the gen­eral prob­lem.

Nick, please see my blog (just click on my name). I have a post about this.

Even if you don’t ac­cept 1 and 2 above, there’s no rea­son to ex­pect that the per­son is tel­ling the truth. He might kill the peo­ple even if you give him the $5, or con­versely he might not kill them even if you don’t give him the $5.

To put it an­other way, con­di­tional on this nonex­is­tent per­son hav­ing these nonex­is­tent pow­ers, why should you be so sure that he’s tel­ling the truth? Per­haps you’ll only get what you want by not giv­ing him the $5. To put it math­e­mat­i­cally, you’re com­put­ing pX, where p is the prob­a­bil­ity and X is the out­come, and you’re say­ing that if X is huge, then just about any nonzero p will make pX be large. But you’re for­get­ting two things: first, if you have the imag­i­na­tion to imag­ine X to be su­per-huge, you should be able to have the imag­i­na­tion to imag­ine p to be su­per-small. (I.e., if you can talk about 3^^^^3, you can talk about 1/​3^^^^3.) Se­cond, once you al­low these hy­po­thet­i­cal su­per-large X’s, you have to ac­knowl­edge the pos­si­bil­ity that you got the sign wrong.
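The symmetry appealed to here (anyone who can write down 3^^^^3 can just as compactly write down 1/3^^^^3) is easy to make concrete. A minimal sketch of Knuth’s up-arrow notation, the convention behind 3^^^^3; only the small cases are computable, since 3^^^3 and beyond exceed any physical memory:

```python
def up(a: int, n: int, b: int) -> int:
    """Knuth's up-arrow: up(a, 1, b) = a**b; each extra arrow iterates
    the previous operation, e.g. a^^b is a tower of b copies of a."""
    if n == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up(a, n - 1, result)
    return result

print(up(3, 1, 3))  # 27 (3^3)
print(up(2, 2, 3))  # 16 (2^^3 = 2^(2^2))
print(up(3, 2, 3))  # 7625597484987 (3^^3: the height of the 3^^^3 tower)

# The reciprocal is no harder to specify for the computable cases,
# e.g. fractions.Fraction(1, up(3, 2, 3)) is an exact 1/3^^3.
```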

Mitchell, it doesn’t seem to me like any sort of accurate many-worlds probability calculation would give you a probability anywhere near low enough to cancel out 3^^^^3. Would you disagree? It seems like there’s something else going on in our intuitions. (Specifically, our intuitions that a good FAI would need to agree with us on this problem.)

I think the an­swer to this ques­tion con­cerns the Kol­mogorov com­plex­ity of var­i­ous things, and the util­ity func­tion as well. What is the Kol­mogorov com­plex­ity of 3^^^3 simu­lated peo­ple? What is the com­plex­ity of the pro­gram to gen­er­ate the simu­lated peo­ple? What is the com­plex­ity of the threat, that for each of these 3^^^3 peo­ple, this par­tic­u­lar man is ca­pa­ble of kil­ling each of them? What sort of prior prob­a­bil­ity do we as­sign to “this man is ca­pa­ble of simu­lat­ing 3^^^3 peo­ple, kil­ling each of them, and will­ing to do so for $5”?

Similarly, the util­ity func­tion for this calcu­la­tion needs to be defined. Utility is usu­ally calcu­lated with de­creas­ing marginal re­turns, such that we usu­ally are de­scribed as hav­ing scope in­sen­si­tivity. Like­wise, we at­tach dis­pro­por­tionately lower util­ity to things with small chances. We’d prob­a­bly also have lower util­ity for these 3^^^3 simu­lated peo­ple on ac­count of their be­ing simu­lated, be­ing gen­er­ated by a pro­gram with very low Kol­mogorov com­plex­ity (ie, lack in­di­vi­d­u­al­ity and “re­al­ness”), and ex­ist­ing at the whim of a seem­ingly cruel and crazy per­son (mean­ing they’re prob­a­bly doomed any­ways).

There’s a few other things to con­sider, such as the prob­a­bil­ity that one of the 3^^^3 simu­lated peo­ple will be able to res­cue his peo­ple, and the prob­a­bil­ity that some­one else will come and threaten to kill 4^^^^4 peo­ple and de­mand enough money that I can’t af­ford to part with the $5 for only 3^^^3 peo­ple.

Over­all, I think the prob­lem is poorly defined for the above rea­sons, but per­haps my main ob­jec­tion would be the at­tempt to use a for­mal­ism in­tended to re­duce un­nec­es­sary com­plex­ity, as a guide to whether some­thing is true.

To solve this problem, the AI would need to calculate the probability of the claim being true, for which it would need to calculate the probability of 3^^^^3 people even existing. Given what it knows about the origins and rate of reproduction of humans, wouldn’t the probability of 3^^^^3 people even existing be approximately 1/3^^^^3? It’s as you said: multiply or divide it by the number of characters in the Bible, and it’s still nearly the same damned incomprehensibly large number. Unless you are willing to argue that there are some bizarre properties of the other universe that would allow so many people to spontaneously arise from nothing, but this is yet another explanatory assumption, and one that I see no way of assigning a probability to.
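The “still nearly the same number” point can be checked in logarithms for the largest still-describable case. 3^^4 = 3^(3^^3) has about 3.6 trillion digits; dividing by 10 raised to the length of the Bible strips only a few million of them. (The 4,000,000-character Bible length is a rough assumption.)

```python
import math

# 3^^4 = 3**(3^^3); its digit count is 3^^3 * log10(3), about 3.64e12.
digits_of_3_up_up_4 = 7_625_597_484_987 * math.log10(3)

bible_chars = 4_000_000  # rough length of the Bible in characters (assumption)

# Dividing 3^^4 by 10**bible_chars removes only bible_chars digits:
remaining_fraction = (digits_of_3_up_up_4 - bible_chars) / digits_of_3_up_up_4
print(remaining_fraction)  # ~0.999999: the number is essentially unchanged
```

On a log scale, a Bible-sized factor is a rounding error against a tower number, which is exactly why the prior has to do all the work in this problem.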

OK, one more try. First, you’re pick­ing 3^^^^3 out of the air, so I don’t see why you can’t pick 1/​3^^^^3 out of the air also. You’re say­ing that your pri­ors have to come from some rigor­ous pro­ce­dure but your util­ity comes from sim­ply tran­scribing what some dude says to you. Se­cond, even if for some rea­son you re­ally want to work with the util­ity of 3^^^^3, there’s no good rea­son for you not to con­sider the pos­si­bil­ity that it’s re­ally −3^^^^3, and so you should be do­ing the op­po­site. The is­sue is not that two huge num­bers will ex­actly can­cel out; the point is that you’re mak­ing up all the num­bers here but are ar­tifi­cially con­strain­ing the ex­pected util­ity differ­en­tial to be pos­i­tive.

If I really wanted to consider this example realistically, I’d say that this guy has no magic powers, so I wouldn’t worry about him killing 3^^^^3 people or whatever. A slightly more realistic scenario would be something like a guy with a bomb in a school, in which case I’d defer to the experts (presumably whoever in the police force deals with people like that) on their judgment of how best to calm him down. There I could see an (approximate) probability calculation being relevant, but, again, the key thing would be whether giving him $5 (or whatever) would make him more or less likely to set the fuse. It wouldn’t be appropriate to say a priori that it could only help.

OK, one more try. First, you’re pick­ing 3^^^^3 out of the air, so I don’t see why you can’t pick 1/​3^^^^3 out of the air also.

You’re not pick­ing 3^^^^3 out of the air. The other guy told you that num­ber.

You can’t pick prob­a­bil­ities out of the air. If you could, why not just set the prob­a­bil­ity that you’re God to one?

If I re­ally wanted to con­sider this ex­am­ple re­al­is­ti­cally, I’d say that this guy has no magic pow­ers, so I wouldn’t worry about him kil­ling 3^^^^3 peo­ple or what­ever.

With what prob­a­bil­ity? Would you give money to a mug­ger if their gun prob­a­bly isn’t loaded? Is this ex­am­ple fun­da­men­tally differ­ent?

There I could see an (approximate) probability calculation being relevant, but, again, the key thing would be whether giving him $5 (or whatever) would make him more or less likely to set the fuse.
Even if it comes out as less, the paradox still exists. It’s just that then you can’t give him the $5. The only way to get out of it is for the probabilities to cancel out to within one part in 3^^^^3, which is absurd.

I think you’re on to something, but I think the key is that someone claiming to be able to influence 3^^^^3 of anything, let alone 3^^^^3 “people,” is making such an extraordinary claim that it would require extraordinary evidence of a magnitude similar to 3^^^^3; i.e. I bet we’re vastly underestimating the complexity of what our mugger is claiming.

Eliezer,
I’d like to take a stab at the in­ter­nal crite­rion ques­tion.
One difference between me and the program you describe is that I have a hoped-for future. Say “I’d like to play golf on Wednesday.” Now, I could calculate the odds of Wednesday not actually arriving (nuclear war, asteroid impact...), or of me not being alive to see it (sudden heart attack...), and I would get an answer greater than zero. Why don’t I operate on those non-zero probabilities? (The other difference between me and the program you describe.)
I think it has to do with faith. That is, I have faith that my hoped-for future will occur, or at least some semblance of it. I seem to have this faith despite previous losses.
Take the field of AI. There is a hoped-for future: a computer will demonstrate intelligence, and some hope the machine will become conscious. There is a faith that “we can solve these problems.” I’m not sure the machine you describe would have either characteristic.
I don’t know how to formalize this, but it seems an important aspect of the situation.

Here’s one for you: let’s assume for argument’s sake that “humans” could include human consciousnesses, not just breathing humans. Then, if a universe with 3^^^^3 “humans” actually existed, what would be the odds that they were NOT all copies of the same parasitic consciousness?

Well… I think we act differently from the AI because we not only know Pascal’s Mugging, we know that it is known. I don’t see why an AI could not know the knowledge of it too, but you do not seem to consider that, which might simply show that it is not relevant, as you, er, seem to have given this some thought...

Eliezer
Sorry to say (be­cause it makes me sound cal­lous), but if some­one can and is will­ing to cre­ate and then de­stroy 3^^^3 peo­ple for less than $5, then there is no value in life, and definitely no moral struc­ture to the uni­verse. The cre­ation and de­struc­tion of 3^^^3 peo­ple (or more) is prob­a­bly hap­pen­ing all the time. There­fore the AI is safe de­clin­ing the wa­ger on purely self­ish grounds.

So, if there is someone out there committing grievous holocausts (if we use realistic numbers like “10 million deaths” or “20 billion deaths,” the probability of this is near 1), then none of us have any moral obligations ever?

I guess so. It’s an in­ter­est­ing idea—kind of like so­cial co­op­er­a­tion prob­lems like re­cy­cling; if too many other peo­ple are not do­ing it, then there isn’t much point in do­ing it your­self. Ap­ply­ing it to moral­ity is in­ter­est­ing. But wrong, I think.

What possible reason would you have to assume that? If we’re talking about an actually intelligent AI, it’d presumably be as smart as any other intelligent being (like, say, a human). If we’re talking about a dumb program, it can take into account anything that we want it to take into account.

People say the fact that there are many gods neutralizes Pascal’s wager—but I don’t understand that at all. It seems to be a total non sequitur. Sure, it opens the door to other wagers being valid, but that is a different issue.

Let’s say I have a simple game against you where, if I choose 1, I win a lotto ticket, and if I choose 0, I lose. There are also a number of other game tables around the room with people winning or not winning lotto tickets. If I want to win the lotto, what number should I pick?

Also, I don’t think there is a fundamental issue with having favour with Allah, Christ, and Zeus simultaneously (so you could actually win, then get up and go play at another table, although there would be a time cost to that).

Now there is the more detailed argument where you argue that a god who desires you to disbelieve in him and oppose his will is equally likely as one who desires that you believe in him and support his will. But as long as there is any imperfection in the mirror, there is a Pascal’s wager to be had.

What if a philoso­pher tries Pas­cal’s Mug­ging on the AI for a joke, and the tiny prob­a­bil­ities of 3^^^^3 lives be­ing at stake, over­ride ev­ery­thing else in the AI’s calcu­la­tions?

I suppose that depends on how he calculates the probability of the threat of the mugger. The very act of giving a specific probability to a threat like that opens one up to infinite risk (i.e. that they will demand infinite things in exchange for infinity x 3^^^^3 lives). So this is a bit like comparing what I might call naive utilitarianism (where one doesn’t consider the wider effects of one’s acts and rules) with pure utilitarianism (where one takes everything into account).

Whether that neu­tral­izes Pas­cal’s wa­ger re­lates to how one re­solves the mir­ror is­sue I men­tioned. If that pro­duces a tidy re­sult then the prob­lem above doesn’t oc­cur.

There is one prob­lem with hav­ing fa­vor of sev­eral gods si­mul­ta­neously:

Ex­o­dus 20:3 “You shall have no other gods be­fore me.”

In fact, one could argue that being a true orthodox Christian would lead you to the Muslim, Hindu, Protestant, and Scientology (etc.) hells, while choosing any one of them would subtract that hell but add the hell of whatever religion you left...

1. There aren’t 3^^^3 people, and there is no machine that can simulate even one person, let alone that many people.

2. Nobody has such magic powers.

3. Even if you don’t accept 1 and 2 above, there’s no reason to expect that the person is telling the truth. He might kill the people even if you give him the $5, or conversely he might not kill them even if you don’t give him the $5.