John_Maxwell_IV

Another question related to Task Y: supposing Task Y does exist, would you rather people working on Task Y think of themselves as “Soft EAs”, or as people who are part of the “Task Y community”? For example, if eating a vegan diet is Task Y, would you like vegans to start thinking of themselves as EAs due to their veganism? If veganism didn’t exist already, and it was an idea that originated from within the EA community, would it be best to spin it off or keep it internal?

I can think of arguments on both sides:

Maybe there’s already a large audience of people who have heard about EA and think it’s really cool but don’t know how to contribute. If these people already exist, we might as well figure out the best things for them to do. This isn’t necessarily an argument for expansion of EA, however. (It’s also not totally clear which direction this consideration points in.)

If Task Y is a task where the argument for positive impact is abstruse & hard to follow, then maybe a “Task Y Movement” isn’t ever going to get off the ground because it lacks popular appeal. Maybe the EA movement has more popular appeal, and the EA movement’s popular appeal can be directed into Task Y.

Some find the EA movement uninviting in its elitism. Even on this forum, reportedly the most elitist EA discussion venue, a highly upvoted post says: “Many of my friends report that reading 80,000 Hours’ site usually makes them feel demoralized, alienated, and hopeless.” There have been gripes about the difficulty of getting grant money for EA projects from grantmaking organizations after it became known that “EA is no longer funding-limited”. (I might be guilty of this griping myself.) Do we want average Janes and Joes reading EA career advice that Google software engineers find “very depressing”? How will they feel after learning that some EAs are considered 1000x as impactful as them?

I changed the sentence you mention to: “If you want to understand present-day algorithms, the ‘pre-driven car’ model of thinking works a lot better than the ‘self-driving car’ model of thinking. The present and past are the only tools we have to think about the future, so I expect the ‘pre-driven car’ model to make more accurate predictions.” I hope this is clearer.

That is clearer, thanks!

I think that it is a hopeless endeavour to aim for such precise language in these discussions at this point in time, because I estimate that it would take a ludicrous amount of additional intellectual labour to reach that level of rigour. It’s too high of a target.

Well, it’s already possible to write code that exhibits some of the failure modes AI pessimists are worried about. If discussions about AI safety switched from trading sentences to trading toy AI programs, which operate on gridworlds and such, I suspect the clarity of discourse would improve.
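To make that concrete, here’s a minimal sketch of the kind of toy program I have in mind: a gridworld agent whose reward function omits something the designer cared about, so the reward-optimal plan exhibits a classic “negative side effect” failure. The layout, names, and numbers are all invented for illustration.

```python
# Toy gridworld illustrating a "negative side effect" failure mode:
# the reward function mentions only the goal, so the reward-optimal plan
# walks straight through a vase the designer implicitly cared about.

GRID = ["S.V.G"]  # S = start, V = vase, G = goal, . = empty floor

def positions(grid):
    """Map each special cell to its (row, col) coordinate."""
    return {cell: (r, c)
            for r, row in enumerate(grid)
            for c, cell in enumerate(row)
            if cell in "SVG"}

def plan_greedy(grid):
    """Head straight for the goal, ignoring everything the reward omits."""
    pos = positions(grid)
    (r, c), (gr, gc) = pos["S"], pos["G"]
    path = []
    while (r, c) != (gr, gc):
        r += (gr > r) - (gr < r)  # step one cell toward the goal
        c += (gc > c) - (gc < c)
        path.append((r, c))
    return path

path = plan_greedy(GRID)
print("path taken:  ", path)
print("vase smashed:", positions(GRID)["V"] in path)  # True: the proxy
# reward says nothing about the vase, so the "optimal" agent destroys it
```

Two people arguing about whether “the AI will understand what we meant” could instead argue about what change to this program’s reward function would fix the behavior, which seems like a strictly more productive disagreement.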

I might post some scraps of arguments on my blog soonish, but those posts won’t be well-written and I don’t expect anyone to really read those.

The “language” section is the strongest IMO. But it feels like “self-driving” and “pre-driven” cars probably exist on some kind of continuum. How well do the system’s classification algorithms generalize? To what degree does the system solve the “distribution shift” problem and tell a human operator to take control in circumstances that the car isn’t prepared for? (You call these circumstances “unforeseen”, but what about a car that attempts to foresee likely situations it doesn’t know what to do in and ask a human for input in advance?) What experiment would let me determine whether a particular car is self-driving or pre-driven? What falsifiable predictions, if any, are you making about the future of self-driving cars?
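One way to make the “tell a human to take control” question concrete: below is a minimal sketch, with an invented threshold and toy data, of a policy that defers to a human whenever the current observation is far from anything in its training set. How well a real system does something like this seems like exactly the empirical question that would separate “pre-driven” from “self-driving”.

```python
# Minimal sketch of a policy that defers to a human when the input looks
# out-of-distribution. The training data, threshold, and action names are
# all hypothetical; the point is only to make the question testable.

import math

# Tiny "training set": observations the system was prepared for.
TRAIN = [((0.0, 0.0), "steer_left"), ((1.0, 1.0), "steer_right")]
THRESHOLD = 0.5  # beyond this distance, treat the situation as unforeseen

def act(obs):
    dist, action = min((math.dist(obs, x), a) for x, a in TRAIN)
    if dist > THRESHOLD:
        return "request_human_takeover"  # distribution-shift escape hatch
    return action

print(act((0.1, 0.0)))  # "steer_left": close to something seen in training
print(act((5.0, 5.0)))  # "request_human_takeover": unfamiliar territory
```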

I was confused by this sentence: “The second pattern is superior by wide margin when it comes to present-day software”.

I think leaky abstractions are a big problem in discussions of AI risk. You’re doubtless familiar with the process by which you translate a vague idea in your head into computer code. I think too many AI safety discussions are happening at the “vague idea” level, and more discussions should be happening at the code level or the “English that’s precise enough to translate into code” level, which seems like what you’re grasping at here. I think if you spent more time working on your ontology and the clarity of your thought, the language section could be really strong.

(Any post which argues the thesis “AI safety is easily solvable” is both a post that argues for de-prioritizing AI safety and a post that is, in a sense, attempting to solve AI safety. I think posts like these are valuable; “AI safety has this specific easy solution” isn’t as within the Overton window of the community devoted to working on AI safety as I would like it to be. Even if the best solution ends up being complex, I think in-depth discussion of why easy solutions won’t work has been neglected.)

Re: the anchoring section, I’m pretty sure it is well documented by psychologists that humans are overconfident in their probabilistic judgements. Even if humans tend to anchor on 50% probability and adjust from there, it seems this isn’t enough to counter our overconfidence bias. Regarding the “Discounting the future” section of your post, see the “Multiple-Stage Fallacy” (toy calculation below). If a superintelligent FAI gets created, it can likely make humanity’s extinction probability almost arbitrarily low through sufficient paranoia. Regarding AI accidents going “really really wrong”, see the instrumental convergence thesis. And AI safety work could be helpful even if countermeasures aren’t implemented universally, through creation of a friendly singleton.
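For a feel of why the Multiple-Stage Fallacy matters here: breaking a prediction into many conjunctive stages and assigning each a modest probability mechanically drives the product toward zero, even when the stages are strongly correlated rather than independent. A toy calculation (all numbers invented):

```python
# Toy illustration of the Multiple-Stage Fallacy (all numbers invented).
# Splitting a claim into many conjunctive stages and assigning each a
# modest probability forces the product toward zero.

stage_probs = [0.7] * 5  # five stages at 70% each, treated as independent
naive = 1.0
for p in stage_probs:
    naive *= p
print(f"naive conjunction:      {naive:.3f}")  # ~0.168

# If each later stage is actually 95% likely *given* the previous one
# (a common-cause structure), the same five stages look very different:
correlated = 0.7 * 0.95 ** 4
print(f"with correlated stages: {correlated:.3f}")  # ~0.570
```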

Presumably the programmer will make some effort to embed the right set of values in the AI. If this is an easy task, doom is probably not the default outcome.

AI pessimists have argued human values will be difficult to communicate due to their complexity. But as AI capabilities improve, AI systems get better at learning complex things.

Both the instrumental convergence thesis and the complexity of value thesis are key parts of the argument for AI pessimism as it’s commonly presented. Are you claiming that they aren’t actually necessary for the argument to be compelling? (If so, why were they included in the first place? This sounds a bit like justification drift.)

the original texts are very clear that the massive jump in AI capability is supposed to come from recursive self-improvement, i.e. the AI helping to do AI research

...because that AI research is useful for some other goal the AI has, such as maximizing paperclips. See the instrumental convergence thesis.

At any rate, though, what does it matter whether the goal is put in after the capability growth, or before/during? It matters in some respects, but not for evaluating the priority of AI safety work, since in both cases the potential for accidental catastrophe exists.

The argument for doom by default seems to rest on the assumption that the AI will, by default, misunderstand human values as the programmer attempts to communicate them. If capability growth comes before a goal is granted, such a misunderstanding seems less likely to occur.

Great idea! I don’t think mass requests are the way to go, though. I’ll bet if someone like Peter Singer, Will MacAskill, or Toby Ord sent them a proposal to write an article about EA, they’d accept. I sent Will a Facebook message to ask him what he thinks.

I think more people should be studying statistics, machine learning, and data science, especially Bayesian methods and causal inference. Not only do these skills offer a chance to contribute to AI safety, they’re also critical for evaluating scientific papers (important for any field, given the replication crisis), doing predictive modeling, and generally thinking in a data-driven and evidence-based way. Math is apparently 80k’s #1 recommendation, but when I was a student, I went to an event where math majors talked about their experiences in industry. Most of them said they didn’t use much of the math they learned, and wished they had studied more statistics. So I would suggest applied math with a statistics emphasis.

If we’re choosing between trying to improve Vox vs trying to discredit Vox, I think EA goals are served better by the former.

Tractability matters. Scott Alexander has been critiquing Vox for years. It might be that improving Vox is a less tractable goal than getting EAs to share their articles less.

they went out on a limb to hire Piper, and they’ve sacrificed some readership to maintain EA fidelity.

My understanding is that Future Perfect is funded by the Rockefeller Foundation. Without knowing the terms of their funding, I think it’s hard to ascribe either virtue or vice to Vox. For example, if the Rockefeller Foundation is paying them per content item in the “Future Perfect” vertical, I could ascribe vice to Vox by saying that they are churning out subpar EA content in order to improve their bottom line.

This is an interesting essay. My thinking is that “coalition norms”, under which politics operate, trade off instrumental rationality against epistemic rationality. I can argue that it’s morally correct from a consequentialist point of view to tell a lie in order to get my favorite politician elected so they will pass some critical policy. But this is a Faustian bargain in the long run, because it sacrifices the epistemology of the group, and causes the people who have the best arguments against the group’s thinking to leave in disgust or never join in the first place.

I’m not saying EAs shouldn’t join political coalitions. But I feel like we’d be sacrificing a lot if the EA movement began sliding toward coalition norms. If you think some coalition is the best one, you can go off and work with that coalition. Or if you don’t like any of the existing ones, create one of your own, or maybe even join one & try to improve it from the inside.

But if we actually want EA to go mainstream, we can’t rely on econbloggers and think-tanks to reach most people. We need easier explanations, and I think Vox provides that well.

Is “taking EA mainstream” the best thing for Future Perfect to try & accomplish? Our goal as a movement is not to maximize the number of people who have the “EA” label; our goal is to do the most good. If we garble the ideas or epistemology of EA in an effort to maximize the number of people who have the “EA” self-label, that seems like a potential example of Goodhart’s Law.

Instead of “taking EA mainstream”, how about “spread memes to Vox’s audience that will cause people in that audience to have a greater positive impact on the world”?

I don’t have stats, it’s just something I hear from vegans when I suggest an organization to provide welfare standards for meat providers. They say it has been tried before and the organization always gets co-opted by the industry. I’m actually kinda skeptical.

If you work as an agricultural inspector and err on the side of making recommendations which happen to improve animal welfare, that seems like it could be high-impact. Also: an argument I hear from vegans is that we can’t have happy meat because any organization which purports to enforce some standard of animal welfare will essentially get bribed by factory farms. If this is true, a way to address it would be to funnel un-bribable people with a passion for animal welfare into those roles.

WRT earning to give, the US Bureau of Labor Statistics maintains an Occupational Outlook Handbook with info on wages and job growth for loads of different jobs. Air traffic controller looks pretty good, although the BLS seems to think you typically need a 2-year degree, so maybe it doesn’t count as “vocational”.

I also think it is worth specifically thinking in terms of jobs which aren’t on the radar of other people, because lower supply is going to mean a higher salary. These reddit threads might be worth checking out. Finally, it might be worthwhile to try to get access to publicly available salary data in order to determine which municipalities pay a lot of money for jobs like being a police officer. (You probably also want to take a careful look at the pension plan in that municipality to ensure that it’s on solid ground fiscally.) BTW, Tyler Cowen likes to argue that hiring more cops and imprisoning fewer people would be good for the USA on both crime-reduction and humanitarian grounds; here is one presentation of the argument.

To me it seems extraordinarily unlikely that any agent capable of performing all these tasks with a high degree of proficiency would simultaneously stand firm in its conviction that the only goal it had reasons to pursue was tiling the universe with paperclips.

Seems a little anthropomorphic. A possibly less anthropomorphic argument: if we possess the algorithms required to construct an agent capable of achieving a decisive strategic advantage, we can also apply those algorithms to pondering moral dilemmas and the like, and use them to construct the agent’s value function.