Improving the future by influencing actors’ benevolence, intelligence, and power

This post was written for Convergence Analysis. Some of the ideas in the post are similar to and/or draw influence from earlier ideas and work.[1]

Overview

This post argues that one useful way to come up with, and assess the expected value of, actions to improve the long-term future is to consider how “benevolent”, “intelligent”, and “powerful” various actors are, and how various actions could affect those actors’ benevolence, intelligence, and power. We first explain what we mean by those terms. We then outline nine implications of our benevolence, intelligence, power (BIP) framework, give examples of actions these implications push in favour of or against, and visually represent these implications.

These implications include that it’s likely good to:

  1. Increase actors’ benevolence.

  2. Increase the intelligence of actors who are sufficiently benevolent.

  3. Increase the power of actors who are sufficiently benevolent and intelligent.

And that it may be bad to:

  1. Increase the intelligence of actors who aren’t sufficiently benevolent.

  2. Increase the power of actors who aren’t sufficiently benevolent and intelligent.

For example, depending on the details, it may be:

  • Good to fund moral philosophy research

  • Good to provide information about emerging technologies to policymakers who want to benefit all countries and generations

  • Bad to provide such information to somewhat nationalistic or short-sighted policymakers

  • Good to help policymakers gain political positions and influence if they (a) want to benefit all countries and generations and (b) understand the arguments for mitigating risks from emerging technologies

  • Bad to help policymakers gain political positions and influence if they (a) want to benefit all countries and generations but (b) think that that requires accelerating technological progress as much as possible

An additional implication of the BIP framework is that the goodness or badness of an increase in an actor’s benevolence, intelligence, or power may often be larger the higher their levels of the other two factors are.

(Throughout this post, we use the term “actor” to mean a person, a group of people of any size, humanity as a whole, or an “institution” such as a government, company, or nonprofit.[2])

Introduction

Let’s say you want to improve the expected value of the long-term future, such as by reducing existential risks. Which of the actions available to you will best achieve that goal? Which of those actions have major downside risks, and might even make the future worse?

Answering these questions precisely and confidently would require long-term predictions about complex and unprecedented events. Luckily, various frameworks, heuristics, and proxies have been proposed that help us at least get some traction on these questions, so we can walk a fruitful middle path between analysis paralysis and random action. For example, we might use the importance, tractability, and neglectedness (ITN) framework, or adopt the heuristic of pursuing differential progress.

This post provides another framework that we believe will often be useful for coming up with, and assessing the expected value of, actions to improve the long-term future: the benevolence, intelligence, power (BIP) framework.

What this framework is useful for

We think it will often be useful to use the BIP framework if one of your major goals in general is improving the expected value of the long-term future (see also MacAskill), and either of the following is true:

  1. You’re trying to come up with actions to improve the long-term future.

  2. You’re considering taking an action which isn’t specifically intended to improve the long-term future, but which is relatively “major” and “non-routine”, such that it could be worth assessing that action’s impacts on the future anyway.

For example, the BIP framework may be worth using when coming up with, or considering taking, actions such as:

  • Starting a new organisation

  • Setting up a workshop for AI researchers

  • Choosing a career path

  • Donating over a thousand dollars

  • Writing an article that you expect to get a substantial amount of attention

The BIP framework could help you recognise key considerations, decide whether to take the action, and decide precisely how to execute the action (e.g., should you target the workshop at AI researchers in general, or at AI safety researchers in particular?).

Also note that the BIP framework is best at capturing impacts that occur via effects on other actors’ behaviours. For example, the framework will better capture the impacts of (a) writing a blog post that influences what biotech researchers work on and how they do so than (b) actually designing a vaccine platform oneself.[3] But this seems only a minor limitation, since most actions to improve the long-term future would likely do so largely via affecting how other actors behave.

Finally, note that BIP is just one framework. It will often be useful to additionally or instead use other frameworks, heuristics, or proxies, and/or a more detailed analysis of the specifics of the situation at hand. For example, even if the BIP framework suggests action X would, in the abstract, be better than action Y, it’s possible your comparative advantage would mean it would be better for you to do action Y.

The three factors

This section will clarify what we mean, in the context of the BIP framework, by the terms benevolence, intelligence, and power. Three caveats first:

  • Our usage of these terms differs somewhat from how they are used in everyday language.

  • The three factors are also somewhat fuzzy; they overlap and interact in some ways,[4] and each factor contains multiple, meaningfully different sub-components.

  • The purpose of this framework is not to judge people, or assess the “value” of people, but rather to aid in prioritisation and in mitigating downside risks, from the perspective of trying to improve the long-term future. As the Centre for Effective Altruism notes in relation to a similar model, variation in the factors they focus on:

often rests on things outside of people’s control. Luck, life circumstance, and existing skills may make a big difference to how much someone can offer, so that even people who care very much can end up having very different impacts. This is uncomfortable, because it pushes against egalitarian norms that we value. [...] We also do not think that these ideas should be used to devalue or dismiss certain people, or that they should be used to idolize others. The reason we are considering this question is to help us understand how we should prioritize our resources in carrying out our programs, not to judge people.[5]

Benevolence

By benevolence, we essentially mean how well an actor’s moral beliefs or values align with the goal of improving the expected value of the long-term future. For example, an actor is more “benevolent” if they value altruism in addition to self-interest, or if they value future people in addition to presently living people.

Given moral and empirical uncertainty, it can of course be difficult to be confident about how well an actor’s moral beliefs or values align with improving the long-term future. For example, how much should an actor value happiness, suffering reduction, preference satisfaction, and other things?[6] But we think differences in benevolence can sometimes be relatively clear and highly important.[7] To illustrate, here’s a list of actors in approximately descending order of benevolence:

  1. Someone purely motivated by completely impartial altruism (including considering welfare and suffering to be just as morally significant no matter when or where it occurs)

  2. Someone mostly motivated by mostly impartial altruism

  3. A “typical person”, who acts partly out of self-interest and partly based on somewhat altruistic “common sense” values

  4. An unusually mean or self-interested person

  5. Terrorists, dictators, and actively sadistic people[8]

We do not include as part of “benevolence” the quality of an actor’s empirical beliefs or more “concrete” values or goals. For example, one person may focus on supporting existential risk reduction (either directly or via donations), while another focuses on supporting advocacy against nuclear power generation. If both actors are similarly motivated by doing what they sincerely believe will benefit future generations, they may have the same level of benevolence; they are both trying to advance “good” moral beliefs or values. However, one person has developed a better plan for helping; they have identified a better path towards advancing those “good” moral beliefs or values.[9][10] This likely reflects a difference in the actors’ levels of “intelligence”, in our sense of the term, which we turn to now.

Intelligence

By intelligence, we essentially mean any intellectual abilities or empirical beliefs that would help an actor make and execute plans that are aligned with the actor’s moral beliefs or values. Thus, this includes things like knowledge of the world, problem-solving skills, ability to learn and adapt, (epistemic) rationality, foresight or forecasting abilities, ability to coordinate with others, etc.[11][12] For example, two actors who both aim to benefit future generations may differ in whether their plan for doing so involves supporting existential risk reduction or supporting advocacy against nuclear power, and this may result from differences in (among other things):

  • how much (mis)information they’ve received about existential risks and interventions to mitigate them, and about the benefits and harms of nuclear power

  • how capable they are of following complex and technical arguments

  • how much they tend to critically reflect on arguments they’re presented with

That was an example where more “intelligence” helped an actor make high-level plans that were better aligned with the actor’s moral beliefs or values. Intelligence can also aid in making better fine-grained, specific plans, or in executing plans. For example, if two actors both support existential risk reduction, the more “intelligent” one may be more likely to:

  • identify what the major risks and intervention options are (rather than, say, focusing solely and uncritically on asteroids)

  • form effective strategies and specific implementation plans (rather than, say, having no real idea how to start on improving nuclear security)

  • predict, reduce, and/or monitor downside risks (rather than, say, widely broadcasting all possible dangerous technologies or applications that they realise are possible)

  • arrive at valuable, novel insights

But intelligence is not the only factor in an actor’s capability to execute its plans; another key factor is their “power”.

Power

By power, we essentially mean any non-intellectual abilities or resources that would help an actor execute its plans (e.g., wealth, political power, persuasive abilities, or physical force). For example, if two actors both support existential risk reduction, the more “powerful” one may be more able to actually fund projects, actually build support for preferred policies, or actually increase the number of people working on these issues. Likewise, if two actors both have malevolent goals (e.g., aim to become dictators) or both have benevolent goals but very misguided plans (e.g., aim to advocate against nuclear power), the more “powerful” actor may be more able to actually set their plans in motion, and may therefore cause more harm.[13]

Intelligence also aids in executing plans, and to that extent both “intelligence” and “power” could be collapsed together as “capability”. But there’s a key distinction between intelligence and power: differences in intelligence are more likely to also affect what plans are chosen, rather than merely affecting how effectively plans are carried out. Thus, as we discuss more below, it is more robustly valuable to increase actors’ intelligence than their power, since increasing a misguided but benevolent actor’s intelligence may help them course-correct, whereas increasing their power may just lead to them travelling more quickly along their net-negative path.[14]

An analogy to illustrate these factors

We’ll use a quick analogy to further clarify these three factors, and to set the scene for our discussion of the implications of the BIP framework.

Imagine you’re the leader of a group of people on some island, and that all that really, truly matters is that you and your group make it to a luscious forest, and avoid a pit of lava.

In this scenario:

  • If you’re quite benevolent, you’ll want to help your group get to the forest.

  • If instead you’re less benevolent, you might not care much about where you and your group get to, or just care about whether you get to the forest, or even want your group to end up in the lava.

  • If you’re quite benevolent and intelligent, you’ll have a good idea of where the forest and the lava are, come up with a good path to get to the forest while avoiding the lava, and have intellectual capacities that’ll help with solving problems that arise as you travel that path.

  • If instead you’re quite benevolent but less intelligent, you might have mistaken beliefs about where the forest or lava are (perhaps even believing the forest is where the lava actually is, and thus leading your group there, despite the best of intentions). Or you might come up with a bad path to the forest (perhaps even one that would take you through the lava). Or you might lack the intellectual capacities required to solve problems along the way.

  • If you’re quite benevolent, intelligent, and powerful, you’ll also have the charisma to convince your group to follow the good path you’ve chosen, and the strength, endurance, and food supplies to physically walk that path and support others in doing so.

  • If instead you’re quite benevolent and intelligent but less powerful, you might lack those capacities and resources, and therefore your group might not manage to actually reach the end of the good path you’ve chosen.

Implications and examples

We’ll now outline nine implications of the BIP framework. These can be seen as heuristics to consider when coming up with, and assessing the expected value of, actions to improve the long-term future, or “major” and “non-routine” actions in general. We’ll also give examples of actions these heuristics may push in favour of or against. Many of these implications and examples should be fairly intuitive, but we think there’s value in laying them out explicitly and connecting them into one broader framework.

Note that we don’t mean to imply that these heuristics alone can determine with certainty whether an action is valuable, nor whether it should be prioritised relative to other valuable actions. That would require also considering other frameworks and heuristics, such as how neglected and tractable the action is.

Influencing benevolence

1. From the perspective of improving the long-term future, it will typically be valuable to increase an actor’s “benevolence”: to cause an actor’s moral beliefs or values to better align with the goal of improving the expected value of the long-term future.[15]

Examples of interventions that might lead to increases in benevolence include EA movement-building, research into moral philosophy or moral psychology, creating materials that help people learn about and reflect on arguments for and against different ethical views, and providing funding or training for people who implement those kinds of interventions. Althaus and Baumann’s discussion of interventions to reduce (or screen for) malevolence is also relevant.

2. That first implication seems robust to differences in how “intelligent” and “powerful” the actor is. That is, it seems increasing an actor’s benevolence will very rarely decrease the value of the future, even if the actor’s levels of intelligence and power are low.

3. It will be more valuable to increase the benevolence of actors who are more intelligent and/or more powerful. For example, it’s more valuable to cause a skilled problem-solver, bioengineering PhD student, senior civil servant, or millionaire to be highly motivated by impartial altruism than to cause the same change in someone with fewer intellectual and non-intellectual abilities and resources. This is because how good an actor’s moral beliefs or values are is especially important if the actor is very good at making and executing plans aligned with those moral beliefs or values.

This suggests that, if one is considering taking an action to improve actors’ benevolence, it could be worth trying to target this towards more intelligent and/or more powerful actors. For example, this could push in favour of focusing EA movement-building somewhat on talented graduate students, successful professionals, etc. (Though there are also considerations that push in the opposite direction, such as the value of reducing actual or perceived elitism within EA.)

Influencing intelligence

4. It will often (but not always) be valuable to increase an actor’s “intelligence”, because this could:

  • Help an actor which already has good plans make even better plans, and/or execute their plans more effectively

  • Help an actor with benevolent values but misguided and detrimental plans realise the ways in which those plans are misguided and detrimental, and thus make better plans

Examples of interventions that might lead to increases in intelligence include funding scholarships, providing effective rationality training, and providing materials that help people “get up to speed” on areas like AI or biotechnology.

5. But it could be harmful (from the perspective of improving the long-term future) to increase the “intelligence” of actors which are below some “threshold” level of benevolence. This is because that could help those actors more effectively make and execute plans that are not so much “misguided” as “well-guided towards bad goals”. (See also Althaus and Baumann.)

For a relatively obvious example, it seems harmful to help terrorists, authoritarians, and certain militaries better understand various aspects of biotechnology. For a more speculative example, if accelerating AI development could increase existential risk (though see also Beckstead), then funding scholarships for AI researchers in general or providing materials on AI to the public at large might decrease the value of the long-term future.

Determining precisely what the relevant “threshold” level of benevolence would be is not a trivial matter, but we think even just recognising that such a threshold likely exists may be useful. The threshold would also depend on the precise type of intelligence improvement that would occur. For example, the same authoritarians or militaries may be “sufficiently” benevolent (e.g., just entirely self-interested, rather than actively sadistic) that improving their understanding of global priorities research is safe, even if improving their understanding of biotech is not.

6. More generally, increases in an actor’s “intelligence” may tend to be:

  • More valuable the more benevolent the actor is

  • More valuable the more powerful the actor is, as long as the actor meets some “threshold” level of benevolence

  • More harmful the more powerful the actor is, if that threshold level of benevolence is not met

Therefore, for example:

  • It may be worth targeting scholarship funding, effective rationality training, or educational materials towards people who’ve shown indications of being highly motivated by impartial altruism (e.g., in cover letters or prior activities).

  • It may be more valuable to provide wealthy, well-connected, and/or charismatic people with important facts, arguments, or training relevant to improving the future, rather than to provide those same things to other people.

  • If a person or group seems like they might have harmful goals, it seems worth being especially careful about helping them get scholarships, training, etc. if they’re also very wealthy, well-connected, etc.

Influencing power

7. It will sometimes be valuable to increase an actor’s “power”, because this could help an actor which already has good plans execute them more effectively. Examples of interventions that would likely lead to increases in power include helping a person invest well or find a high-paying job, boosting national or global economic growth (this boosts the power of many actors, including humanity as a whole), helping a person network, or providing tips or training on public speaking.

8. But it could be harmful (from the perspective of improving the long-term future) to increase the “power” of actors which are below some “threshold” combination of benevolence and intelligence. This is because:

  • That could help those actors more effectively execute plans that are misguided, or that are “well-guided towards bad goals”

  • That wouldn’t help well-intentioned but misguided actors make better plans (whereas increases in intelligence might)

This makes increasing an actor’s power less robustly positive than increasing the actor’s intelligence, which is in turn less robustly positive than increasing their benevolence.

For a relatively obvious example, it seems harmful to help terrorists, authoritarians, and certain militaries gain wealth and political influence. For a more speculative example, if accelerating AI development would increase existential risk, then helping aspiring AI researchers or start-ups in general gain wealth and political influence might decrease the value of the future.

As with the threshold level of benevolence required for an intelligence increase to be beneficial, we don’t know precisely what the required threshold combination of benevolence and intelligence is, and we expect it will differ for different precise types of power increase (e.g., increases in wealth vs increases in political power).

9. More generally, increases in an actor’s “power” may tend to be:

  • More valuable the more benevolent the actor is

  • More valuable the more intelligent the actor is, as long as the actor meets some “threshold” combination of benevolence and intelligence

  • More harmful the more intelligent the actor is, if that threshold combination of benevolence and intelligence is not met

Therefore, for example:

  • It seems better to help people, companies, or governments get money, connections, influence, etc., if their values are more aligned with the goal of improving the long-term future.

  • It seems best to provide grant money to longtermists whose applications indicate they have strong specialist and generalist knowledge, good rationality and problem-solving skills, good awareness of downside risks such as information hazards and how to avoid them, etc.

  • If a person or group seems like they might have harmful goals, it seems worth being especially careful about helping them get money, connections, influence, etc. if they also seem highly intelligent.
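To make the threshold structure of these nine implications a little more concrete, here is a minimal toy model in Python. The functional form, the benevolence threshold of 0.5, and the small “misguidedness” penalty are purely illustrative assumptions, not something the framework itself prescribes; the point is only that a very simple function can reproduce the qualitative pattern described above.

```python
# Toy quantitative sketch of the nine implications. The functional form,
# thresholds, and numbers are illustrative assumptions only; the BIP
# framework itself is qualitative and does not prescribe any formula.

def future_value(b: float, i: float, p: float,
                 b_threshold: float = 0.5, misguidedness: float = 0.1) -> float:
    """Toy expected-value contribution of one actor, with benevolence b,
    intelligence i, and power p each scaled to [0, 1]."""
    # Benevolence only translates into good plans to the extent the actor is
    # intelligent; a low-intelligence actor's plans are assumed to be mildly
    # misguided (net negative) even when their intentions are good.
    plan_direction = i * (b - b_threshold) - misguidedness * (1.0 - i)
    # Power scales how effectively those plans, good or bad, are executed.
    return p * plan_direction


def marginal(attr: str, delta: float, b: float, i: float, p: float) -> float:
    """Change in future_value from increasing one attribute by delta."""
    bumped = {"b": b, "i": i, "p": p}
    bumped[attr] += delta
    return future_value(**bumped) - future_value(b=b, i=i, p=p)


if __name__ == "__main__":
    # Implications 1-3: raising benevolence is never harmful here, and matters
    # more for more intelligent and powerful actors.
    print(marginal("b", 0.1, b=0.5, i=0.2, p=0.2))  # small positive
    print(marginal("b", 0.1, b=0.5, i=0.9, p=0.9))  # larger positive
    # Implications 4-6: raising intelligence helps a sufficiently benevolent
    # actor, but harms when benevolence is below the threshold.
    print(marginal("i", 0.1, b=0.9, i=0.5, p=0.5))  # positive
    print(marginal("i", 0.1, b=0.2, i=0.5, p=0.5))  # negative
    # Implications 7-9: raising power helps only above a combined
    # benevolence-and-intelligence threshold.
    print(marginal("p", 0.1, b=0.9, i=0.8, p=0.5))  # positive
    print(marginal("p", 0.1, b=0.9, i=0.1, p=0.5))  # negative (well-meaning but misguided)
```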

Visualising the implications of the BIP framework

We could approximately represent these implications using three-dimensional graphs, with benevolence, intelligence, and power on the axes, and higher expected values of the long-term future represented by greener rather than redder shades. To keep things simple and easy to understand in a still image, we’ll instead provide a pair of two-dimensional graphs: one showing benevolence and intelligence, and the other showing a “combination of benevolence and intelligence” (which we will not try to precisely define) and power. The implications are similar for each graph’s pair of dimensions. Thus, again for simplicity, we’ve used two graphs that are mathematically identical to each other; they just have different labels.

We’ll also show a vector field on each graph (see also our post on Using vector fields to visualise preferences and make them consistent). That is, we will add arrows at each point, whose direction represents which direction it would be beneficial to move from that point, and whose size represents how beneficial movement in that direction would be.[16]
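For illustration, a graph of this kind can be generated with a short script. The sketch below reuses the toy value function from the previous section, restricted to benevolence and intelligence with power held fixed; the specific functional form, threshold, and colour scheme are assumptions made only for illustration, not part of the framework.

```python
# Minimal sketch of a BIP-style vector-field graph, using an assumed toy
# value function V(B, I) = I*(B - 0.5) - 0.1*(1 - I) for illustration only.
import numpy as np
import matplotlib.pyplot as plt

def value(benevolence, intelligence, b_threshold=0.5, misguidedness=0.1):
    """Toy expected value of the future, with power held fixed."""
    return intelligence * (benevolence - b_threshold) - misguidedness * (1.0 - intelligence)

# Background shading: greener shades for higher expected value, redder for lower.
b = np.linspace(0.0, 1.0, 200)
i = np.linspace(0.0, 1.0, 200)
B, I = np.meshgrid(b, i)
plt.contourf(B, I, value(B, I), levels=50, cmap="RdYlGn")
plt.colorbar(label="Expected value of the long-term future (toy)")

# Vector field: arrows point in the direction of steepest increase in value,
# and their length shows how beneficial movement in that direction would be.
bq, iq = np.meshgrid(np.linspace(0.05, 0.95, 12), np.linspace(0.05, 0.95, 12))
dV_dB = iq                # partial derivative of the toy value w.r.t. benevolence
dV_dI = (bq - 0.5) + 0.1  # partial derivative of the toy value w.r.t. intelligence
plt.quiver(bq, iq, dV_dB, dV_dI, color="black")

plt.xlabel("Benevolence")
plt.ylabel("Intelligence")
plt.title("Toy BIP vector field (illustrative)")
plt.show()
```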

Here is the first graph:

This graph captures the implications that:

  • Increases in benevolence seem unlikely to ever be harmful (the arrows never point to the left)

  • Increases in intelligence can be either harmful or beneficial, depending on the actor’s benevolence (the arrows on the left point down, and those on the right point up)

  • Changes in benevolence matter more the more intelligent the actor is (the arrows that are higher up are larger)

This graph does not capture the relevance of the actor’s level of power. Our second graph captures that, though it loses some of the above nuances by collapsing benevolence and intelligence together:

This second graph is mathematically identical to the first graph, and has similar implications.

Conclusion

The BIP (benevolence, intelligence, power) framework can help with coming up with, or assessing the expected value of, actions to improve the long-term future (or “major” and “non-routine” actions in general). In particular, it suggests nine specific implications, which we outlined above and which can be summarised as follows:

  • It may be most robustly or substantially valuable to improve an actor’s benevolence, followed by improving their intelligence, followed by improving their power (assuming those different improvements are equally tractable).

  • It could even be dangerous to improve an actor’s intelligence (if they’re below a certain threshold level of benevolence), or to improve their power (if they’re below a certain threshold of benevolence, or a certain threshold combination of benevolence and intelligence).

  • A given increase in any one of these three factors may often be more valuable the higher the actor is on the other two factors (assuming some threshold level of benevolence, or of benevolence and intelligence, is met).

We hope this framework, and its associated heuristics, can serve as one additional, helpful tool in your efforts to benefit the long-term future. We’d also be excited to see future work which uses this framework as one input in assessing the benefits and downside risks of specific interventions (including but not limited to those interventions briefly mentioned in this post).

This post builds on earlier work by Justin Shovelain and an earlier draft by Sara Haxhia. I’m grateful to Justin, David Kristoffersson, Andrés Gómez Emilsson, and Ella Parkinson for helpful comments. We’re grateful also to Andrés for work on an earlier related draft, and to Siebe Rozendal for helpful comments on an earlier related draft. This does not imply these people’s endorsement of all of this post’s claims.


  1. In particular, the following ideas and work:

    (We had not yet watched the Schubert and Leung talks when we developed the ideas in this post.) ↩︎

  2. It’s worth noting that a group’s benevolence, intelligence, or power may not simply be the sum or average of its members’ levels of those attributes. For example, to the extent that a company has “goals”, its primary goals may not be the primary goals of any of its directors, employees, or stakeholders. Relatedly, it may be harder to assess or influence the benevolence, intelligence, or power of a group than that of an individual. ↩︎

  3. That said, the framework may still have the ability to capture more “direct” impacts, or to be adapted to do so. For example, one could frame vaccine platforms as improving the long-term future by reducing the levels of intelligence and power that are required to mitigate biorisks, and increasing the levels of intelligence and power that are required to create biorisks. One could even frame this as “in effect” increasing the intelligence and/or power of benevolent actors in the biorisk space, and “in effect” decreasing the intelligence and/or power of malevolent actors in that space. ↩︎

  4. For example, increasing an actor’s benevolence and intelligence might increase their prestige, one of two main forms of status (see The Secret of Our Success). Both forms of status would effectively increase an actor’s power, as they would increase the actor’s ability to influence others. ↩︎

  5. See also the section on Elitism vs. egalitarianism in that post. ↩︎

  6. Arguably, taking moral uncertainty seriously might itself be one component of benevolence, such that more benevolent actors will put more effort into figuring out what moral beliefs and values they should have, and will be more willing to engage in moral trade. ↩︎

  7. It can also be hard to be confident even about whether improving the long-term future should be our focus. But this post takes that as a starting assumption. ↩︎

  8. See also the discussion of “Dark Tetrad” traits in Reducing long-term risks from malevolent actors. ↩︎

  9. This distinction between “moral beliefs or values” and “plans” can perhaps also be thought of as a distinction between “relatively high-level / terminal / fundamental goals or values” and “relatively concrete / instrumental goals or values”. ↩︎

  10. We use advocacy against nuclear power generation merely as an example. Our purpose here is not really to argue against such advocacy. If you believe such advocacy is net positive and worth prioritising, this shouldn’t stop you engaging with the core ideas of this post. For some background on the topic, see Halstead. ↩︎

  11. Note that, given our loose definition of intelligence, two actors who gain the same intellectual ability or empirical belief may gain different amounts of intelligence, if that ability or belief is more useful for one set of moral beliefs or values than for another. For example, knowledge about effective altruism or global priorities research may be more useful for someone aiming to benefit the world than for someone aiming to get rich or be spiteful, and thus may improve the former type of person’s intelligence more. ↩︎

  12. Thus, what we mean by “intelligence” will not be identical to what is measured by IQ tests.

    See Legg and Hutter for a collection of definitions of intelligence. We think our use of the word intelligence lines up fairly well with most of these, such as Legg and Hutter’s own definition: “Intelligence measures an agent’s ability to achieve goals in a wide range of environments.” However, that definition, taken literally, would appear to also include “non-cognitive” capabilities and resources, such as wealth or physical strength, which we instead include as part of “power”. (For more, see Intelligence vs. other capabilities and resources.)

    Our use of “intelligence” also lines up fairly well with how some people use “wisdom” (e.g., in Bostrom, Dafoe, and Flynn). However, at times “wisdom” seems to also implicitly include something like “benevolence”. ↩︎

  13. Note that the way we’ve defined “power” means that the same non-intellectual ability or resource may affect one actor’s power more than another’s, as it may be more useful given one plan than given another. See also footnote 11, the concept of asymmetric weapons, and Carl Shulman’s comment (which prompted me to add this footnote). ↩︎

  14. One caveat to this is that actors may be able to use certain types of power to, in effect, “buy more intelligence”, and thereby improve how well-aligned their plans are with their goals. For example, the Open Philanthropy Project can use money to hire additional research analysts and thereby improve their ability to determine which cause areas, interventions, grantees, etc. they should support in order to best advance their values. ↩︎

  15. As noted in footnote 7, there is room for uncertainty about whether we should focus on the goal of improving the long-term future in the first place. Additionally, improving benevolence may often involve moral advocacy, and there’s room for debate about how important, tractable, neglected, or “zero- vs positive-sum” moral advocacy is (for related discussion, see Christiano and Baumann). ↩︎

  16. Both graphs are of course rough approximations, for illustrative purposes only. Precise locations, numbers, and intensities of each colour should not be taken too literally. We’ve arbitrarily chosen to make each scale start at 0, but the same basic conclusions could also be reached if the scales were made to extend into negative numbers. ↩︎