What Should the Average EA Do About AI Alignment?

I’m try­ing to get a han­dle on what ad­vice to give peo­ple who are con­vinced AI is a prob­lem wor­thy of their time, *prob­a­bly* the most im­por­tant prob­lem, but are not sure if they have the tal­ent nec­es­sary to con­tribute.

A trend­ing school of thought is “AI Align­ment needs care­ful, clever, agenty thinkers. ‘Hav­ing the cor­rect opinion’ is not that use­ful. There is no­body who can tell you what ex­actly to do, be­cause no­body knows. We need peo­ple who can figure out what to do, in a very messy, challeng­ing prob­lem.”

This sort of makes sense to me, but it seems like only a few sorts of peo­ple can re­al­is­ti­cally con­tribute in this fash­ion (even given growth mind­set con­sid­er­a­tions). It also seems like, even if most peo­ple could con­tribute, it doesn’t provide very good next-ac­tions to peo­ple who have reached the “okay, this is im­por­tant” stage, but who aren’t (yet?) ready to change their ca­reer di­rec­tion.

Here is the ad­vice I cur­rently give, fol­lowed by the back­ground as­sump­tions that prompted it. I’m look­ing for peo­ple to challenge me on any of these:

Op­tions for the non-or-min­i­mally-tech­ni­cal-ish:

1) Donate. (1%, or more if you can do so with­out sac­ri­fic­ing the abil­ity to take valuable fi­nan­cial risks to fur­ther your ca­reer. MIRI, FHI, 80k and CFAR seem like the most cred­ible ways to turn money into more AI Align­ment ca­reer cap­i­tal)

2) Arrange your life such that you can easily identify volunteer opportunities for gruntwork, operations, or other nontechnical skills for AI safety orgs, and dedicate enough time and attention to helping with that gruntwork that you are more of an asset than a burden (e.g. helping to run conferences and workshops). To help with AI-specific things, it seems necessary to be in the Bay, Boston, Oxford, Cambridge or London.

3a) Embark on projects or career paths that will cause you to gain deep skills, and in particular, train the habit/skill of noticing things that need doing, and proactively developing solutions to accomplish them. (These projects/careers can be pretty arbitrary. To eventually tie them back into AI, you need to get good enough that you’ll either be able to help found a new org or provide rare skills to an existing org.)

3b) Ideally, choose projects that involve working together in groups, that require you to resolve differences in opinion on how to use scarce resources, and which require you to interact with other groups with subtly different goals. Practice coordination skills mindfully.

4) Provide a read­ing list of blogs and so­cial-me­dia feeds to stay up-to-date on the more ac­cessible, less tech­ni­cally de­mand­ing thoughts re­lat­ing to AI Safety. Prac­tice think­ing crit­i­cally on your own about them. (this doesn’t re­ally come with an ob­vi­ous “Part 2” that trans­lates that into mean­ingful ac­tion on its own)

If tech­ni­cal-ish, and/​or will­ing to learn a LOT

5) Look at the MIRI and 80k AI Safety syllabus, and see how much of it looks like something you’d be excited to learn. If applicable to you, consider diving into that so you can contribute to the cutting edge of knowledge.

6) If you’re a tal­ented pro­gram­mer, learn a lot about ML/​Deep Learn­ing and then stay up to date on the lat­est ac­tual AI re­search, so you can po­si­tion your­self at the top AI com­pa­nies and po­ten­tially have in­fluence with them on which di­rec­tion they go.

An important question I’d like to answer is “how can you tell if it makes sense to alter your career in pursuit of #5 or #6?” This is very non-obvious to me.

I talk to a lot of people who seem roooooughly analogous to myself, i.e. pretty smart but not extremely smart. In my case I think I have a credible claim on “community building” being my comparative advantage, but I notice a lot of people default to “be a community person or influencer”, and I’m really wary of a decision tree that outputs a tower of meta-community-stuff for anyone who’s not obviously expert at anything else. I’d like to have better, fleshed out, scalable suggestions for people fairly similar to me.

Back­ground assumptions

Var­i­ous things that fed into the above recom­men­da­tions (some­times di­rectly, some­times in­di­rectly). This is a liv­ing doc­u­ment that I’ll up­date as peo­ple per­suade me oth­er­wise. Again, ap­pre­ci­ate get­ting challenged on any of these.

AI Timelines and Goals

AI timelines could be anywhere from 5 years (if DeepMind is more advanced than they’re telling anyone), to 20 years (if it turns out general AI is only a couple of breakthroughs away from current Deep Learning trends, and we’re (un)lucky on how soon those breakthroughs come), to much longer if general AI turns out to be harder. We should be prepared for each possibility.

Eventually, all of our efforts will need to translate into one of the following:

- the ability to develop insights about AI Alignment
- the ability to cause AI research to be safely aligned
- the ability to stop or slow down AI research until it can be safely aligned


- MIRI seems like the most shovel-ready in­stance of “ac­tual AI Safety re­search”. It’s not ob­vi­ous to me whether MIRI is do­ing the best work, but they seem to be at least do­ing good work, and they do seem un­der­funded, and fund­ing them seems like the most straight­for­ward way to turn money into more pro­fes­sional AI re­searchers.

- FHI is a con­tender for sec­ond-best fund­ing-tar­get for X-risk re­duc­tion, in­clud­ing some thought about AI al­ign­ment.

- 80k, CFAR and Leverage are the orgs I know of that seem to be concretely attempting to solve the “career capital gap”, with different strategies. They each have elements that seem promising to me. I’m not sure what their respective funding constraints are. (Note: I recently became a bit more interested in Leverage than I had been, but examining Leverage is a blogpost unto itself and I’m not going to try doing so here.)

- The Far Fu­ture Fund (re­cently an­nounced, run by Nick Beck­stead) may be a good way to out­source your dona­tion de­ci­sion.

Ca­reer Cap­i­tal, Agency and Self Improvement

- An im­por­tant limit­ing reagent is “peo­ple able to be agents.” More than any sin­gle skil­lset, we need peo­ple who are able to look at or­ga­ni­za­tions and wor­ld­states, figure out what’s not be­ing done yet, figure out if they cur­rently have the skills to do it, and backchain from that to be­ing able to be­come the sort of peo­ple who have the skills to do that.

- To self-im­prove the fastest, as a per­son and as an org, you need high qual­ity feed­back loops.

- In my ex­pe­rience, there is a crit­i­cal thresh­old be­tween an “agent” and a non-agent. Peo­ple get ac­ti­vated as agents when they a) have a con­crete pro­ject to work on that seems im­por­tant to them that’s above their cur­rent skill level, and b) have some high sta­tus men­tor-figure who takes time out of their day to tell them in a se­ri­ous voice “this pro­ject you are work­ing on is im­por­tant.” (The lat­ter step is not nec­es­sary but it seems to help a lot. Note: this is NOT a men­tor figure who nec­es­sar­ily spends a lot of time train­ing you. They are Gan­dalf, tel­ling you your mis­sion is im­por­tant and they be­lieve in you, and then mostly stay­ing out of the way)

(Ac­tual longterm men­tor­ship is also su­per helpful but doesn’t seem to be the limit­ing is­sue)

- Beyond “be an agent”, we do need highly skil­led peo­ple at a va­ri­ety of spe­cific skills—both be­cause AI Safety orgs need them, and be­cause high skill al­lows you to get a job at an AGI re­search in­sti­tu­tion.

- De­spite at­tempt­ing to achieve this for sev­eral years, it’s not ob­vi­ous that CFAR has de­vel­oped the abil­ity to pro­duce agents, but it’s suc­ceeded (at least slightly) at at­tract­ing ex­ist­ing agents, train­ing them in some skills, and fo­cus­ing them on the right prob­lems.

Think­ing Critically

- We need peo­ple who can think crit­i­cally, and who spend time/​at­ten­tion be­ing able to think crit­i­cally and deeply about the right things.

- Think­ing use­fully crit­i­cally re­quires be­ing up to speed on what other peo­ple are think­ing, so you aren’t du­pli­cat­ing work.

- It is currently very hard to keep up with ALL the different developments across the AI/EA/Career-Capital-Building spaces, both because the updates come from all over the internet (and sometimes in person), and because people’s writing is often verbose.

- It is possible for the average EA to learn to think more critically, but it requires a significant time investment.


- Coordination problems are extraordinarily hard. Humanity essentially failed the “Nuclear Weapons test” (i.e. we survived the Cold War, but we easily might not have. Squeaking by with a C- is not acceptable).

- Some people have argued the AI problem is much harder than nukes, which isn’t clear to me. (In the long term you do need to stop everyone ever from developing unsafe AI, but it seems like the critical period is the window wherein AGI is first possible, when something like 6-20 companies will be working on it at once.)

- The Rationality and EA communities aren’t obviously worse than the average community at coordination, but they are certainly not much better. And EAs are definitely not better than average at inducing coordination/cooperation among disparate groups whose goals aren’t aligned with ours.

- If your goal is to influence orgs or AGI researchers, you need to make sure you’re actually following a path that leads to real influence. (e.g. “You can network your way into being Elon Musk’s friend who he invites over for dinner, but that doesn’t mean he’ll listen to you about AI safety. The same goes for networking your way onto the Google Brain team or the Google AI Ethics board. Have a clear model of influence and how much of it you credibly have.”)

- Mainstream politics is even harder than coordinating corporations, and to a first approximation is useless for purposes of AI alignment.

Open Questions

This is mostly a re­cap.

0) Is any­thing in my frame­work grossly wrong?

1) My primary question is “how do we filter for people who should consider dropping everything and focusing on the technical aspects of AI Safety, or seriously pursuing careers that will position them to influence AGI research institutions?” These seem like the most important things to actually output, and it seems most important for those people to cultivate particular types of critical thinking, technical skill and ability-to-influence.

For peo­ple who are not well suited, or not yet ready to do 1), how can we ei­ther:

2) Make it easier for them to translate marginal effort into meaningful contribution, or create a clearer path towards:

3) Leveling up to the point where they are able to take in the entire field, and generate useful things to do (without requiring much effort from other heavily involved people whose time is scarce).

Po­ten­tial Fur­ther Reading

I have not read all of these, so can­not speak to which are most im­por­tant, but I think it’s use­ful to at least skim the con­tents of each of them so you have a rough idea of the ideas at play. I’m in­clud­ing them here mostly for easy refer­ence.

(If some­one wanted to gen­er­ate a 1-3 sen­tence sum­mary of each of these and in­di­cate who the tar­get au­di­ence is, I’d be happy to edit that in. I hope­fully will even­tu­ally have time to do that my­self but it may be a while)

MIRI’s Re­search Guide

80,000 Hours AI Safety Syllabus

UC Berkeley Cen­ter for Hu­man Com­pat­i­ble AI Bibliography

Case Study of CFAR’s Effectiveness

AI Im­pacts Timelines and Strate­gies (ex­am­ples of how to think strate­gi­cally given differ­ent AI timelines)

Con­crete Prob­lems in AI Safety

OpenAI’s Blog

Agen­tFoun­da­tions.org (this is sort of a stack-overflow /​ tech­ni­cal dis­cus­sion fo­rum for dis­cussing con­cepts rele­vant to AI al­ign­ment)

De­liber­ate Grad School