Near-term focus, robustness, and flow-through effects

I recently read Open Phil’s 2018 cause prioritization update, which describes their new policy of splitting funding into buckets that make donations according to particular worldviews. They divide worldviews into animal-inclusive vs. human-centric, and long-term- vs. near-term-focused. I think these two factors cover a lot of the variance in EA viewpoints, so they’re pretty interesting to examine. As someone who’s generally pretty focused on the long term, I found this a good jumping-off point for thinking about arguments against that focus, as well as general concerns about robustness and flow-through effects.

The discussion of near-term focus brings up many good points I’ve heard against naive long-term-EV maximization. The future is hard to predict; you can’t get feedback on how your actions turn out; acting on long-range predictions has a mixed or bad track record; it can involve confusing moral questions, like the value of creating new beings. Aiming simply to preserve civilization runs the risk of carrying bad values into the far future; aiming at improving human values or averting worst cases gives you an even harder target to hit.[1]

On a more intuitive level, it feels uncomfortable to be swayed by arguments that imply you can achieve astronomical amounts of value, especially if you think you’re vulnerable to persuasion; if so, a sufficiently silver-tongued person can convince you to do anything. You can also couch this in terms of meta-principles, or an outside view on the class of people who thought they’d have an outsize impact on future utility if they did something weird. (I’m not sure what the latter would imply, actually; as Zhou Enlai never said, “What’s the impact of the French Revolution? Too soon to tell.”)

I think these are mostly quite good objections; if other long-termist EAs are like me, they’ve mostly heard these arguments, agreed with them, adjusted a bit, and continued to work on long-term projects of some sort.

The part of me that most sympathizes with these points is the one that seeks robustness and confidence in impact. It’s hard for me to adapt to cluster-thinking, which I suspect underlies strong near-termist positions, so I mostly think of this as a constrained optimization problem: either minimizing maximum badness given some constraint on EV, or maximizing EV minus a robustness penalty. If you don’t include a heavy time discount, though, I think it’s plausible that this still leads you to “long-term-y” interventions, such as reducing international tension or expanding moral circles of concern. This is partly due to the difficulty of accounting for flow-through effects. I confess I haven’t thought much about those for short-term human-focused interventions like global health and poverty, but my sense is that unless you optimize fairly hard for good flow-through, you’re likely to have a nontrivial chance of negative effects.
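To make the “EV minus robustness penalty” framing above concrete, here’s a toy sketch. All intervention names, expected values, and variances are made-up illustrations, not actual estimates:

```python
# Toy model: score each intervention as expected value minus a penalty
# on its downside uncertainty. The numbers below are hypothetical,
# chosen only to illustrate the framing.

def penalized_score(ev, downside_var, penalty=1.0):
    """Expected value minus a robustness penalty on downside variance."""
    return ev - penalty * downside_var

# (expected value, downside variance) per unit of resources -- made up
interventions = {
    "bednets": (10.0, 1.0),                  # modest EV, well-understood
    "moral circle expansion": (50.0, 30.0),  # high EV, much less robust
}

best = max(interventions, key=lambda k: penalized_score(*interventions[k]))
print(best)  # moral circle expansion
```

Under these invented numbers the “long-term-y” option still wins at a penalty weight of 1.0; you’d need to crank the weight above roughly 1.4 before the more robust intervention comes out ahead, which is one way of seeing how the conclusion hinges on how heavily you price robustness.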

Another way of thinking about this is to consider what you should have done as an EA at some point in the past. It seems plausible that, while you might not have been able to avert nuclear or AI catastrophe directly in 1500, you could have contributed to meaningful moral growth, or to differential advances in e.g. medicine (though now we’re already in the realm of plausible negative flow-through, via earlier bioweapons → death, offense-favoring dynamics, and a lack of norms against “WMDs” as a category). Or maybe it’s more obvious that ministering to whatever poor and sick people you could reach would have been the best thing?

I haven’t built up much knowledge or deep consideration about this, so I’m quite curious what you guys think. If you support short-termism, is it mainly out of robustness concerns? How do you deal with flow-through uncertainty in general, and how do you conceptualize it, if naive EV maximization is inadequate? Open Phil’s post suggests capping the impact of an argument at 10–100x the number of persons alive today, but choosing benchmarks/thresholds/tradeoffs for this kind of thing seems difficult to do in a principled way.

[1] Another object-level point, due to AGB, is that some reasonable base rate of x-risk means that the expected lifespan of human civilization conditional on solving a particular risk is still hundreds or thousands of years, not the astronomical patrimony that’s often used to justify far-future interventions. Of course, this applies much less if you’re talking about solving an x-risk in a way that significantly reduces the long-term base rate, as a Friendly AI would.

If I recall correctly, this paper by Tom Sittler also makes the point you paraphrased as “some reasonable base rate of x-risk means that the expected lifespan of human civilization conditional on solving a particular risk is still hundreds or thousands of years”, among others.

I think the argument was written up formally on the forum, but I’m not finding it. I think it goes like this: if the chance of existential risk is 0.1%/year, the expected duration of human civilization is 1000 years. If you decrease the risk to 0.05%/year, the expected duration is 2000 years, so you have only added a millennium. However, if you get safe AI and colonize the galaxy, you might get billions of years. But I would argue that if you reduce the chance that nuclear war destroys civilization (from which we might not recover), then you increase the chances of getting safe AI and colonization, and therefore you can attribute overwhelming value to mitigating nuclear war.
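The arithmetic above can be sketched directly: under a constant annual extinction risk p, survival time is geometrically distributed, so the expected duration is 1/p. (The near-zero rate at the end is just an illustrative stand-in for a post-safe-AI or post-colonization regime, not an estimate.)

```python
# Expected survival time under a constant annual extinction risk p:
# the mean of a geometric distribution, 1/p.

def expected_duration_years(annual_risk):
    """Expected years of civilization under a constant annual x-risk."""
    return 1.0 / annual_risk

print(expected_duration_years(0.001))   # 0.1%/year  -> 1000.0 years
print(expected_duration_years(0.0005))  # 0.05%/year -> 2000.0 years

# Halving the risk only buys a millennium in expectation...
print(expected_duration_years(0.0005) - expected_duration_years(0.001))  # 1000.0

# ...but driving the long-run rate near zero (the safe-AI/colonization
# scenario) makes the expectation astronomically large:
print(expected_duration_years(1e-9))  # 1e9 years
```

This is why the comparison is so sensitive to whether a given intervention merely halves the ongoing rate or helps reach a regime where the rate itself collapses.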

> But I would argue that if you reduce the chance that nuclear war destroys civilization (from which we might not recover), then you increase the chances of getting safe AI and colonization, and therefore you can attribute overwhelming value to mitigating nuclear war.

For clarity’s sake, I don’t disagree with this. It does mean that your argument for the overwhelming value of mitigating nuclear war is still predicated on developing a safe AI (or some other way of massively reducing the base rate) at a future date, rather than being a self-contained argument based solely on nuclear war being an x-risk. Which is totally fine and reasonable, but a useful distinction to make, in my experience. For example, it would now make sense to compare whether working on safe AI directly, or working on nuclear war in order to increase the number of years we have to develop safe AI, generates better returns per unit of effort. This in turn I think is going to depend heavily on AI timelines, which (at least to me) was not obviously an important consideration for the value of working on mitigating the fallout of a nuclear war!

I should have said develop safe AI or colonize the galaxy, because I think either one would dramatically reduce the base rate of existential risk. The way I think about the value of nuclear war mitigation being affected by AI timelines is that if AI comes soon, there are fewer years during which we are actually threatened by nuclear war. This is one reason I only looked out about 20 years in my cost-effectiveness analysis of alternate foods versus AI. I think these risks could be correlated, because one mechanism of far-future impact of nuclear war is worse values ending up in AI (if nuclear war does not collapse civilization).