Why I’m donating to MIRI this year

I’ve been thinking about my annual donation, and I’ve decided to donate to MIRI this year. I haven’t previously donated to MIRI, and my reasons for doing so now are somewhat nuanced, so I thought they were worth explaining.

I previously thought that MIRI was taking a somewhat less-than-ideal approach to AI safety, and they were not my preferred donation target. Three things have changed:

  1. My opinion of the approach has changed a little (actually not much);

  2. I think they are moving towards a better version of their approach (more emphasis on good explanations of their work);

  3. The background distribution of work and opportunities in AI safety has shifted significantly.

Overall I do not fully endorse MIRI’s work. I somewhat agree with the perspective of the Open Philanthropy Project’s review, although I am generally more positive towards their work:

  • I agree with those of their technical advisors who thought that solving the problems on the research agenda could be beneficial for addressing potential risks from advanced AI, rather than with those who did not.

  • I thought that the assessment of the level of progress was quite unfair.

    • A key summary sentence, “One way of summarizing our impression of this conversation is that the total reviewed output is comparable to the output that might be expected of an intelligent but unsupervised graduate student over the course of 1-3 years”, seems bemusingly unfair:

      • I see significant value in the work of giving technical framings of the problems in the first place. In some cases, once this is done, the solutions are not technically hard; I wonder if the OpenPhil review was relatively more concerned with the work on solutions.

      • The point about supervision felt potentially a bit confused. I think it’s significantly easier to make quick progress when fleshing out the details of an established field than when trying to give a good grounding for a new field.

        • On the other hand, I do think the clarity of writing in some of MIRI’s outputs has not been great, and that this is potentially something supervision could have helped with. I think they’ve been improving on this.

      • My reference class is PhD students in mathematics at Oxford, a group I’m familiar with. I find it plausible that this would line up with the output of some of the most talented such students, but I thought the wording implied comparison to a significantly lower bar than this.

      • (Edit: see also this useful discussion with Jacob Steinhardt in the comment thread.)

These views are based on several conversations with MIRI researchers over approximately the last three years, and reading a fraction of their published output.

Two or three years ago, I thought that it was important that AI safety engage significantly more with mainstream AI research, and build towards having an academic field which attracted the interest of many researchers. It seemed that MIRI’s work was quite far from optimised for doing that. I thought that the abstract work MIRI was doing might be important eventually, but that it was less time-critical than field-building.

Now, the work to build a field which ties into existing AI research is happening, and is scaling up quite quickly. Examples:

I expect this trend to continue for at least a year or two. Moreover, I think this work is significantly talent-constrained (and capacity-constrained) rather than funding-constrained. In contrast, MIRI has been developing a talent pipeline and recently failed to reach its funding target, so marginal funds are likely to have a significant effect on actual work done over the coming year. I think that this funding consideration represents a significant-but-not-overwhelming point in favour of MIRI over other technical AI safety work (perhaps a factor of between 5 and 20 if considering allocating money compared to allocating labour, but I’m pretty uncertain about this number).

A few years ago, I was not convinced that MIRI’s research agenda was what would be needed to solve AI safety. Today, I remain not convinced. However, I’m not convinced by any agenda. I think we should pursue a portfolio of different research agendas, focusing in each case not on optimising for technical results in the short term, but on optimising for a solid foundation that we can build a field on and attract future talent to. As MIRI’s work looks set to occupy a much smaller slice of the total work going forwards than it has historically, adding resources to this part of the portfolio looks relatively more valuable than before. Moreover, MIRI has become significantly better at clear communication of its agenda and work, which I think is crucial for this objective of building a solid foundation, and I know they are interested in continuing to improve on this dimension.

The combination of these factors, along with the traditional case for the importance of AI safety as a field, makes me believe that MIRI may well be the best marginal use of money today.

Ways I think this might be a mistake:

  • Opportunity cost of money

    • I’m fairly happy to prefer funding MIRI over any other direct technical work in AI safety that I know of.

      • There might be other opportunities I am unaware of. For example, I would like more people to work on Paul Christiano’s agenda. I don’t know a way to fund that directly (though I know some MIRI staff were looking at working on it a few months ago).

    • It seems plausible that money could be better spent by 80,000 Hours or CFAR in helping to develop a broader pipeline of talent for the field. However, I think that a significant bottleneck is the development of really solid agendas, and I think MIRI may be well-placed to do this.

    • Given the recent influx of money, another field than AI safety might be the best marginal use of resources. I personally think that prioritisation research is extremely important, and would consider donating to the Centre for Effective Altruism to support this instead of AI safety.

  • Opportunity cost of researchers’ time

    • Perhaps MIRI will employ researchers to work on a suboptimal agenda, and they would otherwise get jobs working on a more important part of AI safety (if those other parts are indeed talent-constrained).

    • However, I think that the background of MIRI researchers is often not the same as would be needed for work on (say) more machine-learning-oriented research agendas.

  • Failing to shift MIRI’s focus

    • If MIRI were doing work that was useful but suboptimal, one might think that failure to reach funding targets could get them to re-evaluate. However:

      • I think they are already shifting their focus in a direction I endorse.

      • Withholding funding is a fairly non-cooperative way to try to achieve this. I’d prefer to give funding, and simply tell them my concerns.

Extra miscellaneous factors in favour of MIRI:

  • I should have some epistemic humility

    • I’ve had a number of conversations with MIRI researchers about the direction of their research, in moderate depth. I follow and agree with some of the things they are saying. In other cases, I don’t follow the full force of the intuitions driving their choices.

      • The fact that they failed to explain it to me so that I could fully follow decreases my credence that what they have in mind is both natural and correct (relative to before they tried this), since I think it tends to be easier to find good explanations for natural and correct things.

        • This would be a stronger update for me, except that I’ve also had the experience of people at MIRI repeatedly failing to convey something to me, and then succeeding over a year later. A clean case of this is that I previously believed decision theory was pretty irrelevant for AI safety, and I now see mechanisms for it to matter. This is good evidence that at least in some cases they have access to intuitions which are correct about something important, even when they’re unable to clearly communicate them.

      • In these conversations I’ve also been able to assess their epistemics and general approach.

        • I don’t fully endorse these, but they seem somewhat reasonable. I also think some of my differences arise from differences in communication style.

        • Some general trust in their epistemics leads me to have some belief that there are genuinely useful insights that they are pursuing, even when they aren’t yet able to clearly communicate them.

    • (Edit: see also this discussion with Anna Salamon in the comment thread.)

  • Training and community building

    • I think MIRI has a culture which encourages some useful perspectives on AI safety (I’m roughly pointing towards what they describe as “security mindset”).

      • I’m less convinced than they are that this mindset is particularly crucial, relative to, e.g., an engineering mindset, but I do think there is a risk of it being under-represented in a much larger AI safety community.

    • I think that one of the more effective ways to encourage deep sharing of culture and perspective between research groups is exchange of staff.

    • If MIRI has more staff in the short term, this will allow greater dispersal of this perspective in the next few years.

  • Money for explicitly long-term work will tend to be neglected

    • As AI systems become more powerful over the coming decades, there will be increasing short-term demand for AI safety work. I think that in many cases high-quality work producing robust solutions to short-term problems could be helpful for some of the longer-term problems. However, there will be lots of short-term incentives to focus on short-term problems, or even long-term problems with short-term analogues. This means that altruistic money may have more leverage over the long-term scenarios.

Overall, I don’t think we understand the challenges to come well enough that we should commit to certain approaches yet. I think MIRI has some perspectives that I’d like to see explored and explained further, I think they’re moving in a good direction, and I’m excited to see what they’ll manage in the next couple of years.

Disclaimers: These represent my personal views, not those of my employers. Several MIRI staff are known personally to me.

[*] There are actually some tax advantages to my donating to CEA by requesting a lower salary. This previously swayed me to donate to CEA, but I think I actually care more about the possible bias. However, if someone who was planning to donate to CEA wants to do a donation switch with me, we could recover and split these benefits, probably worth a few hundred dollars. Please message or email me if interested.
