Long-Term Future Fund: April 2019 grant recommendations

Please note that the fol­low­ing grants are only recom­men­da­tions, as all grants are still pend­ing an in­ter­nal due dili­gence pro­cess by CEA.

This post contains our allocation and some explanatory reasoning for our Q1 2019 grant round. Earlier this year we opened an application for grant requests, which stayed open for about one month; we then received a large, unanticipated donation of about $715k, which caused us to reopen the application for another two weeks. We then used a mixture of independent voting and consensus discussion to arrive at our current grant allocation.

What is listed below is only a set of grant recommendations to CEA, which will run them through a set of due-diligence tests to ensure that they are compatible with its charitable objectives and that making these grants will be logistically feasible.

Grant Recipients

Each grant re­cip­i­ent is fol­lowed by the size of the grant and their one-sen­tence de­scrip­tion of their pro­ject.

  • An­thony Aguirre ($70,000): A ma­jor ex­pan­sion of the Me­tac­u­lus pre­dic­tion plat­form and its community

  • Tessa Alex­a­nian ($26,250): A biorisk sum­mit for the Bay Area biotech in­dus­try, DIY biol­o­gists, and biose­cu­rity researchers

  • Sha­har Avin ($40,000): Scal­ing up sce­nario role-play for AI strat­egy re­search and train­ing; im­prov­ing the pipeline for new researchers

  • Lu­cius Cavi­ola ($50,000): Con­duct­ing post­doc­toral re­search at Har­vard on the psy­chol­ogy of EA/​long-termism

  • Con­nor Flex­man ($20,000): Perform­ing in­de­pen­dent re­search in col­lab­o­ra­tion with John Salvatier

  • Ozzie Gooen ($70,000): Build­ing in­fras­truc­ture for the fu­ture of effec­tive fore­cast­ing efforts

  • Jo­hannes Hei­decke ($25,000): Sup­port­ing as­piring re­searchers of AI al­ign­ment to boost them­selves into productivity

  • David Girardo ($30,000): A re­search agenda rigor­ously con­nect­ing the in­ter­nal and ex­ter­nal views of value synthesis

  • Nikhil Ku­na­puli ($30,000): A study of safe ex­plo­ra­tion and ro­bust­ness to dis­tri­bu­tional shift in biolog­i­cal com­plex systems

  • Ja­cob Lager­ros ($27,000): Build­ing in­fras­truc­ture to give X-risk re­searchers su­perfore­cast­ing abil­ity with min­i­mal overhead

  • Lau­ren Lee ($20,000): Work­ing to pre­vent burnout and boost pro­duc­tivity within the EA and X-risk communities

  • Alex Lintz ($17,900): A two-day, ca­reer-fo­cused work­shop to in­form and con­nect Euro­pean EAs in­ter­ested in AI governance

  • Orpheus Lummis ($10,000): Upskilling in contemporary AI techniques, deep RL, and AI safety, before pursuing an ML PhD

  • Vy­ach­es­lav Matyuhin ($50,000): An offline com­mu­nity hub for ra­tio­nal­ists and EAs

  • Te­gan McCaslin ($30,000): Con­duct­ing in­de­pen­dent re­search into AI fore­cast­ing and strat­egy questions

  • Robert Miles ($39,000): Pro­duc­ing video con­tent on AI alignment

  • Anand Srini­vasan ($30,000): For­mal­iz­ing per­cep­tual com­plex­ity with ap­pli­ca­tion to safe in­tel­li­gence amplification

  • Alex Turner ($30,000): Build­ing to­wards a “Limited Agent Foun­da­tions” the­sis on mild op­ti­miza­tion and corrigibility

  • Eli Tyre ($30,000): Broad pro­ject sup­port for ra­tio­nal­ity and com­mu­nity build­ing interventions

  • Mikhail Yagudin ($28,000): Giv­ing copies of Harry Pot­ter and the Meth­ods of Ra­tion­al­ity to the win­ners of EGMO 2019 and IMO 2020

  • CFAR ($150,000): Un­re­stricted donation

  • MIRI ($50,000): Un­re­stricted donation

  • Ought ($50,000): Un­re­stricted donation

To­tal dis­tributed: $923,150
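For readers who want to verify the arithmetic, here is a minimal Python sketch that checks the listed amounts against the stated total (the names and amounts are simply copied from the list above):

```python
# Sanity check: the individual grant amounts above should sum to the stated total.
grants = {
    "Anthony Aguirre": 70_000,
    "Tessa Alexanian": 26_250,
    "Shahar Avin": 40_000,
    "Lucius Caviola": 50_000,
    "Connor Flexman": 20_000,
    "Ozzie Gooen": 70_000,
    "Johannes Heidecke": 25_000,
    "David Girardo": 30_000,
    "Nikhil Kunapuli": 30_000,
    "Jacob Lagerros": 27_000,
    "Lauren Lee": 20_000,
    "Alex Lintz": 17_900,
    "Orpheus Lummis": 10_000,
    "Vyacheslav Matyuhin": 50_000,
    "Tegan McCaslin": 30_000,
    "Robert Miles": 39_000,
    "Anand Srinivasan": 30_000,
    "Alex Turner": 30_000,
    "Eli Tyre": 30_000,
    "Mikhail Yagudin": 28_000,
    "CFAR": 150_000,
    "MIRI": 50_000,
    "Ought": 50_000,
}
total = sum(grants.values())
assert total == 923_150  # matches the stated total
print(f"Total distributed: ${total:,}")  # -> Total distributed: $923,150
```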

Grant Rationale

Here we explain the purpose of each grant and summarize our reasoning behind recommending it. Each summary was written by the fund member who was most excited about recommending the relevant grant (subject to some constraints on who had time available to write up their reasoning). The summaries differ a lot in length, depending on how much time the different fund members had available to explain their reasoning.

Wri­te­ups by He­len Toner

Alex Lintz ($17,900)

A two-day, ca­reer-fo­cused work­shop to in­form and con­nect Euro­pean EAs in­ter­ested in AI governance

Alex Lintz and some col­lab­o­ra­tors from EA Zürich pro­posed or­ga­niz­ing a two-day work­shop for EAs in­ter­ested in AI gov­er­nance ca­reers, with the goals of giv­ing par­ti­ci­pants back­ground on the space, offer­ing ca­reer ad­vice, and build­ing com­mu­nity. We agree with their as­sess­ment that this space is im­ma­ture and hard to en­ter, and be­lieve their sug­gested plan for the work­shop looks like a promis­ing way to help par­ti­ci­pants ori­ent to ca­reers in AI gov­er­nance.

Wri­te­ups by Matt Wage

Tessa Alex­a­nian ($26,250)

A biorisk sum­mit for the Bay Area biotech in­dus­try, DIY biol­o­gists, and biose­cu­rity researchers

We are funding Tessa Alexanian to run a one-day biosecurity summit, immediately following the SynBioBeta industry conference. We have also put Tessa in touch with some experienced people in the biosecurity space who we think can help make sure the event goes well.

Sha­har Avin ($40,000)

Scal­ing up sce­nario role-play for AI strat­egy re­search and train­ing; im­prov­ing the pipeline for new researchers

We are fund­ing Sha­har Avin to help him hire an aca­demic re­search as­sis­tant and for other mis­cel­la­neous re­search ex­penses. We think pos­i­tively of Sha­har’s past work (for ex­am­ple this re­port), and mul­ti­ple peo­ple we trust recom­mended that we fund him.

Lu­cius Cavi­ola ($50,000)

Con­duct­ing post­doc­toral re­search at Har­vard on the psy­chol­ogy of EA/​long-termism

We are fund­ing Lu­cius Cavi­ola for a 2-year post­doc at Har­vard work­ing with Pro­fes­sor Joshua Greene. Lu­cius plans to study the psy­chol­ogy of effec­tive al­tru­ism and long-ter­mism, and an EA aca­demic we trust had a pos­i­tive im­pres­sion of him. We are split­ting the cost of this pro­ject with the EA Meta Fund be­cause some of Cavi­ola’s re­search (on effec­tive al­tru­ism) is a bet­ter fit for the Meta Fund while some of his re­search (on long-ter­mism) is a bet­ter fit for our fund.

Ought ($50,000)

We funded Ought in our last round of grants, and our rea­son­ing for fund­ing them in this round is largely the same. Ad­di­tion­ally, we wanted to help Ought di­ver­sify its fund­ing base be­cause it cur­rently re­ceives al­most all its fund­ing from only two sources and is try­ing to change that.

Our com­ments from last round:

Ought is a non­profit aiming to im­ple­ment AI al­ign­ment con­cepts in real-world ap­pli­ca­tions. We be­lieve that Ought’s ap­proach is in­ter­est­ing and worth try­ing, and that they have a strong team. Our un­der­stand­ing is that hiring is cur­rently more of a bot­tle­neck for them than fund­ing, so we are only mak­ing a small grant. Part of the aim of the grant is to show Ought as an ex­am­ple of the type of or­ga­ni­za­tion we are likely to fund in the fu­ture.

Wri­te­ups by Alex Zhu

Nikhil Ku­na­puli ($30,000)

A study of safe ex­plo­ra­tion and ro­bust­ness to dis­tri­bu­tional shift in biolog­i­cal com­plex systems

Nikhil Kunapuli is doing independent deconfusion research for AI safety. His approach is to develop better foundational understandings of various concepts in AI safety, like safe exploration and robustness to distributional shift, by exploring these concepts in complex systems science and theoretical biology, domains outside of machine learning to which these concepts are also applicable. To quote an illustrative passage from his application:

When an or­ganism within an ecosys­tem de­vel­ops a unique mu­ta­tion, one of sev­eral things can hap­pen. At the level of the or­ganism, the mu­ta­tion can ei­ther be neu­tral in terms of fit­ness, mal­adap­tive and lead­ing to re­duced re­pro­duc­tive suc­cess and/​or death, or adap­tive. For an adap­tive mu­ta­tion, the up­graded fit­ness of the or­ganism will change the fit­ness land­scape for all other or­ganisms within the ecosys­tem, and in re­sponse, the struc­ture of the ecosys­tem will ei­ther be per­turbed into a new at­trac­tor state or desta­bi­lized en­tirely, lead­ing to ecosys­tem col­lapse. Re­mark­ably, most mu­ta­tions do not kill their hosts, and most mu­ta­tions also do not lead to ecosys­tem col­lapse. This is ac­tu­ally sur­pris­ing when one con­sid­ers the stag­ger­ing com­plex­ity pre­sent within a sin­gle genome (tens of thou­sands of genes deeply in­ter­twined through ge­nomic reg­u­la­tory net­works) as well as an ecosys­tem (billions of or­ganisms oc­cu­py­ing unique niches and con­stantly co-evolv­ing). One would naïvely think that a sys­tem so com­plex must be highly sen­si­tive to change, and yet these sys­tems are ac­tu­ally sur­pris­ingly ro­bust. Na­ture some­how figured out a way to cre­ate ro­bust or­ganisms that could re­spond to and func­tion in a shift­ing en­vi­ron­ment, as well as how to build ecosys­tems in which or­ganisms could be free to safely ex­plore their ad­ja­cent pos­si­ble new forms with­out kil­ling all other species.

Nikhil spent a sum­mer do­ing re­search for the New England Com­plex Sys­tems In­sti­tute. He also spent 6 months as the cofounder and COO of an AI hard­ware startup, which he left be­cause he de­cided that di­rect work on AI safety is more ur­gent and im­por­tant.

I recom­mended that we fund Nikhil be­cause I think Nikhil’s re­search di­rec­tions are promis­ing, and be­cause I per­son­ally learn a lot about AI safety ev­ery time I talk with him. The qual­ity of his work will be as­sessed by re­searchers at MIRI.

Anand Srini­vasan ($30,000)

For­mal­iz­ing per­cep­tual com­plex­ity with ap­pli­ca­tion to safe in­tel­li­gence amplification

Anand Srinivasan is doing independent deconfusion research for AI safety. His angle of attack is to develop a framework that will allow researchers to make provable claims about what specific AI systems can and cannot do, based on factors like their architectures and their training processes. For example, AlphaGo can “only have thoughts” about patterns on Go boards and lookaheads, which aren't expressive enough to encode thoughts about malicious takeover.

AI researchers can build safe and extremely powerful AI systems by relying on intuitive judgments of their capabilities. However, these intuitions are non-rigorous and prone to error, especially since powerful optimization processes can generate solutions that are totally novel and unexpected to humans. Furthermore, competitive dynamics will incentivize rationalization about which AI systems are safe to deploy. Under fast takeoff assumptions, a single rogue AI system could lead to human extinction, making it particularly risky for us to rely exclusively on intuitive judgments about which AI systems are safe. Anand's goal is to develop a framework that formalizes these intuitions well enough to permit future AI researchers to make provable claims about what future AI systems can and can't internally represent.

Anand was the CTO of an en­ter­prise soft­ware com­pany that he cofounded with me, where he man­aged a six-per­son en­g­ineer­ing team for two years. Upon leav­ing the com­pany, he de­cided to re­fo­cus his efforts to­ward build­ing safe AGI. Be­fore drop­ping out of MIT, Anand worked on Is­ing mod­els for fast image clas­sifi­ca­tion and fuzzy man­i­fold learn­ing (which was later in­de­pen­dently pub­lished as a top pa­per at NIPS).

I recom­mended that we fund Anand be­cause I think Anand’s re­search di­rec­tions are promis­ing, and I per­son­ally learn a lot about AI safety ev­ery time I talk with him. The qual­ity of Anand’s work will be as­sessed by re­searchers at MIRI.

David Girardo ($30,000)

A re­search agenda rigor­ously con­nect­ing the in­ter­nal and ex­ter­nal views of value synthesis

David Girardo is do­ing in­de­pen­dent de­con­fu­sion re­search for AI safety. His an­gle of at­tack is to elu­ci­date the on­tolog­i­cal prim­i­tives for rep­re­sent­ing hi­er­ar­chi­cal ab­strac­tions, draw­ing from his ex­pe­rience with type the­ory, cat­e­gory the­ory, differ­en­tial ge­om­e­try, and the­o­ret­i­cal neu­ro­science.

I recommended that we fund David because I think David's research directions are very promising, and because I personally learn a lot about AI safety every time I talk with him. Tsvi Benson-Tilsen, a MIRI researcher, has also recommended that David get funding. The quality of David's work will be assessed by researchers at MIRI.

Wri­te­ups by Oliver Habryka

I have a broad sense that funders in EA tend to give little feedback to the organizations they fund, as well as to organizations they explicitly decided not to fund (usually due to time constraints). So in my writeups below I tried to be as transparent as possible in explaining what actually caused me to believe each grant was a good idea and what my biggest hesitations are, and I took a lot of opportunities to explain background models of mine that might help others get better at understanding my future decisions in this space.

For some of the grants below, I think there exist more publicly defensible (or easier to understand) arguments than the ones I give here. However, I tried to explain the actual models that drove my decisions for these grants, which are often hard to put into a few paragraphs of text, so I apologize in advance for some of the explanations below almost certainly being a bit hard to understand.

Note that when I’ve writ­ten about how I hope a grant will be spent, this was in aid of clar­ify­ing my rea­son­ing and is in no way meant as a re­stric­tion on what the grant should be spent on. The only re­stric­tion is that it should be spent on the pro­ject they ap­plied for in some fash­ion, plus any fur­ther le­gal re­stric­tions that CEA re­quires.

Mikhail Yagudin ($28,000)

Giv­ing copies of Harry Pot­ter and the Meth­ods of Ra­tion­al­ity to the win­ners of EGMO 2019 and IMO 2020

From the ap­pli­ca­tion:

EA Rus­sia has the oral agree­ments with IMO [In­ter­na­tional Math Olympiad] 2020 (Saint Peters­burg, Rus­sia) & EGMO [Euro­pean Girls’ Math­e­mat­i­cal Olympiad] 2019 (Kyiv, Ukraine) or­ga­niz­ers to give HPMORs [copies of Harry Pot­ter and the Meth­ods of Ra­tion­al­ity] to the medal­ists of the com­pe­ti­tions. We would also be able to add an EA /​ ra­tio­nal­ity leaflet made by CFAR (I con­tacted Ti­mothy Tel­leen-Law­ton on that mat­ter).

My thoughts and reasoning

[Edit & clarification: The books will be given out by the organisers of the IMO and EGMO as prizes for the 650 people who got far enough to participate, all of whom are “medalists”.]

My model for the im­pact of this grant roughly breaks down into three ques­tions:

  1. What effects does read­ing HPMOR have on peo­ple?

  2. How good of a tar­get group are Math Olympiad win­ners for these effects?

  3. Is the team com­pe­tent enough to ex­e­cute on their plan?

What effects does read­ing HPMOR have on peo­ple?

My mod­els of the effects of HPMOR stem from my em­piri­cal ob­ser­va­tions and my in­side view on ra­tio­nal­ity train­ing.

  • Empirically, a substantial number of top people in our community have (a) entered due to reading and feeling a deep connection to HPMOR and (b) attributed their approach to working on the long-term future in substantial part to the insights they learned from reading HPMOR. This includes some individuals receiving grants on this list, and some individuals on the grant-making team.

  • I also give weight here to my inside view of the skills that HPMOR helps to teach. I'll try to point at the things I think HPMOR does exceptionally and uniquely well, though I find it a bit hard to make my models fully explicit here in an appropriate amount of space.

    • The most pow­er­ful tools that hu­man­ity has dis­cov­ered so far are meth­ods for think­ing quan­ti­ta­tively and sci­en­tifi­cally about how our uni­verse works, and us­ing this un­der­stand­ing to ma­nipu­late the uni­verse. HPMOR at­tempts to teach the fun­da­men­tal skills be­hind this think­ing in three main ways:

      • The first way HPMOR teaches sci­ence is that the reader is given many ex­am­ples of the in­side of some­one’s mind when they are think­ing with the goal of ac­tu­ally un­der­stand­ing the world and are rea­son­ing with the sci­en­tific and quan­ti­ta­tive un­der­stand­ing hu­man­ity has de­vel­oped. HPMOR is a fic­tional work, con­tain­ing a highly de­tailed world with char­ac­ters whose ex­pe­rience a reader em­pathises with and sto­rylines that evoke re­sponses from a reader. The char­ac­ters in HPMOR demon­strate the core skills of quan­ti­ta­tive, sci­en­tific rea­son­ing: form­ing a hy­poth­e­sis, mak­ing a pre­dic­tion, throw­ing out the hy­poth­e­sis when the pre­dic­tion does not match re­al­ity, and oth­er­wise up­dat­ing prob­a­bil­is­ti­cally when they don’t yet have de­ci­sive ev­i­dence.

      • The sec­ond way HPMOR teaches sci­ence is that key sci­en­tific re­sults and mechanisms are wo­ven into the nar­ra­tive of the book. Stud­ies in the heuris­tics and bi­ases liter­a­ture, ge­netic se­lec­tion, pro­gram­ming loops, Bayesian rea­son­ing, and more are all ex­plained in an un­usu­ally nat­u­ral man­ner. They aren’t just added on top of the nar­ra­tive in or­der for there to be sci­ence in the book; in­stead, the story’s uni­verse is in fact con­strained by these the­o­ries in such a way that they are nat­u­rally brought up by char­ac­ters at­tempt­ing to figure out what they should do.

      • This con­tributes to the third way HPMOR helps teach sci­en­tific think­ing: HPMOR is speci­fi­cally de­signed to be un­der­stand­able in ad­vance of the end of the book, and many read­ers have used the think­ing tools taught in the book to do just that. One of the key bot­tle­necks in in­di­vi­d­u­als’ abil­ity to af­fect the long-term fu­ture is the abil­ity to deal with the uni­verse as though it is un­der­stand­able in prin­ci­ple, and HPMOR cre­ates a uni­verse where this is so and in­cludes char­ac­ters do­ing their best to un­der­stand it. This sort of un­der­stand­ing is nec­es­sary for be­ing able to take ac­tions that will have large, in­tended effects on im­por­tant and difficult prob­lems 10^n years down the line.

    • The book also con­tains char­ac­ters who viscer­ally care about hu­man­ity, other con­scious be­ings, and our col­lec­tive long-term fu­ture, and take sig­nifi­cant ac­tions in their own lives to en­sure that this fu­ture goes well.

  • It is fi­nally worth not­ing that HPMOR does all of the above things while also be­ing a highly en­gag­ing book that has been read by hun­dreds of thou­sands of read­ers (if not more) pri­mar­ily for plea­sure. It is the most re­viewed Harry Pot­ter fan fic­tion on fan­fic­tion.net, which is a re­mark­able state of af­fairs.

How good of a tar­get group are Math Olympiad win­ners for these effects?

I think that Math Olympiad win­ners are a very promis­ing de­mo­graphic within which to find in­di­vi­d­u­als who can con­tribute to im­prov­ing the long-term fu­ture. I be­lieve Math Olympiads se­lect strongly on IQ as well as (weakly) on con­scien­tious­ness and cre­ativity, which are all strong pos­i­tives. Par­ti­ci­pants are young and highly flex­ible; they have not yet made too many ma­jor life com­mit­ments (such as which uni­ver­sity they will at­tend), and are in a po­si­tion to use new in­for­ma­tion to sys­tem­at­i­cally change their lives’ tra­jec­to­ries. I view hand­ing them copies of an en­gag­ing book that helps teach sci­en­tific, prac­ti­cal and quan­ti­ta­tive think­ing as a highly asym­met­ric tool for helping them make good de­ci­sions about their lives and the long-term fu­ture of hu­man­ity.

I’ve also vis­ited and par­ti­ci­pated in a va­ri­ety of SPARC events, and found the cul­ture there (which is likely to be at least some­what rep­re­sen­ta­tive of Math Olympiad cul­ture) very healthy in a broad sense. Par­ti­ci­pants dis­played high lev­els of al­tru­ism, a lot of will­ing­ness to help one an­other, and an im­pres­sive amount of am­bi­tion to im­prove their own think­ing and af­fect the world in a pos­i­tive way. Th­ese ob­ser­va­tions make me op­ti­mistic about efforts that build on that cul­ture.

I think it’s im­por­tant when in­ter­act­ing with minors, and at­tempt­ing to im­prove (and thus change) their life tra­jec­to­ries, to make sure to en­gage with them in a safe way that is re­spect­ful of their au­ton­omy and does not put so­cial pres­sures on them in ways they may not yet have learned to cope with. In this situ­a­tion, Mikhail is work­ing with/​through the in­sti­tu­tions that run the IMO and EGMO, and I ex­pect those in­sti­tu­tions to (a) have lots of ex­pe­rience with safe­guard­ing minors and (b) have norms in place to make sure that in­ter­ac­tions with the stu­dents are pos­i­tive.

Is the team com­pe­tent enough to ex­e­cute on their plan?

I don’t have a lot of in­for­ma­tion on the team, don’t know Mikhail, and have not re­ceived any ma­jor strong en­dorse­ment for him and his team, which makes this the weak­est link in the ar­gu­ment. How­ever, I know that they are co­or­di­nat­ing both with SPARC (which also works to give books like HPMOR to similar pop­u­la­tions) and the team be­hind the highly suc­cess­ful Rus­sian print­ing of HPMOR, two teams who have ex­e­cuted this kind of pro­ject suc­cess­fully in the past. So I felt com­fortable recom­mend­ing this grant, es­pe­cially given its rel­a­tively limited down­side.

Alex Turner ($30,000)

Build­ing to­wards a “Limited Agent Foun­da­tions” the­sis on mild op­ti­miza­tion and corrigibility

From the ap­pli­ca­tion:

I am a third-year com­puter sci­ence PhD stu­dent funded by a grad­u­ate teach­ing as­sis­tantship; to ded­i­cate more at­ten­tion to al­ign­ment re­search, I am ap­ply­ing for one or more trimesters of fund­ing (spring term starts April 1).

[…]

Last sum­mer, I de­signed an ap­proach to the “im­pact mea­sure­ment” sub­prob­lem of AI safety: “what equa­tion cleanly cap­tures what it means for an agent to change its en­vi­ron­ment, and how do we im­ple­ment it so that an im­pact-limited pa­per­clip max­i­mizer would only make a few thou­sand pa­per­clips?”. I be­lieve that my ap­proach, At­tain­able Utility Preser­va­tion (AUP), goes a long way to­wards an­swer­ing both ques­tions ro­bustly, con­clud­ing:

> By chang­ing our per­spec­tive from “what effects on the world are ‘im­pact­ful’?” to “how can we stop agents from overfit­ting their en­vi­ron­ments?”, a nat­u­ral, satis­fy­ing defi­ni­tion of im­pact falls out. From this, we con­struct an im­pact mea­sure with a host of de­sir­able prop­er­ties […] AUP agents seem to ex­hibit qual­i­ta­tively differ­ent be­hav­ior […]

Pri­mar­ily, I aim both to out­put pub­lish­able ma­te­rial for my the­sis and to think deeply about the cor­rigi­bil­ity and mild op­ti­miza­tion por­tions of MIRI’s ma­chine learn­ing re­search agenda. Although I’m ex­cited by what AUP makes pos­si­ble, I want to lay the ground­work of deep un­der­stand­ing for mul­ti­ple al­ign­ment sub­prob­lems. I be­lieve that this kind of clear un­der­stand­ing will make pos­i­tive AI out­comes more likely.

My thoughts and reasoning

I’m ex­cited about this be­cause:

  • Alex’s ap­proach to find­ing per­sonal trac­tion in the do­main of AI Align­ment is one that I would want many other peo­ple to fol­low. On LessWrong, he read and re­viewed a large num­ber of math text­books that are use­ful for think­ing about the al­ign­ment prob­lem, and sought pub­lic in­put and feed­back on what things to study and read early on in the pro­cess.

  • He wasn’t in­timi­dated by the com­plex­ity of the prob­lem, but started think­ing in­de­pen­dently about po­ten­tial solu­tions to im­por­tant sub-prob­lems long be­fore he had “com­pre­hen­sively” stud­ied the math­e­mat­i­cal back­ground that is com­monly cited as be­ing the foun­da­tion of AI Align­ment.

  • He wrote up his thoughts and hypotheses in a clear way, sought feedback on them early, and ended up making a set of novel contributions to an interesting sub-field of AI Alignment quite quickly (in the form of his work on impact measures, on which he recently collaborated with the DeepMind AI Safety team).

Po­ten­tial concerns

Th­ese in­tu­itions, how­ever, are a bit in con­flict with some of the con­crete re­search that Alex has ac­tu­ally pro­duced. My in­side views on AI Align­ment make me think that work on im­pact mea­sures is very un­likely to re­sult in much con­crete progress on what I per­ceive to be core AI Align­ment prob­lems, and I have talked to a va­ri­ety of other re­searchers in the field who share that as­sess­ment. I think it’s im­por­tant that this grant not be viewed as an en­dorse­ment of the con­crete re­search di­rec­tion that Alex is pur­su­ing, but only as an en­dorse­ment of the higher-level pro­cess that he has been us­ing while do­ing that re­search.

As such, I think a necessary component of this grant was that I talked to other people in AI Alignment whose judgment I trust, who do seem excited about Alex's work on impact measures. I would not have recommended this grant, or at least not as large a grant, without their endorsement. Without it, I would have been worried about the risk of diverting attention from what I think are more promising approaches to AI Alignment, and about a potential dilution of the field by introducing a set of (to me) somewhat dubious philosophical assumptions.

Over­all, while I try my best to form con­crete and de­tailed mod­els of the AI Align­ment re­search space, I don’t cur­rently de­vote enough time to it to build de­tailed mod­els that I trust enough to put very large weight on my own per­spec­tive in this par­tic­u­lar case. In­stead, I am mostly defer­ring to other re­searchers in this space that I do trust, a num­ber of whom have given pos­i­tive re­views of Alex’s work.

In ag­gre­gate, I have a sense that the way Alex went about work­ing on AI Align­ment is a great ex­am­ple for oth­ers to fol­low, I’d like to see him con­tinue, and I am ex­cited about the LTF Fund giv­ing out more grants to oth­ers who try to fol­low a similar path.

Or­pheus Lum­mis ($10,000)

Upskilling in contemporary AI techniques, deep RL, and AI safety, before pursuing an ML PhD

From the application:

Notable planned subprojects:

[…]

My thoughts and reasoning

We funded Or­pheus in our last grant round to run an AI Safety Un­con­fer­ence just af­ter NeurIPS. We’ve got­ten pos­i­tive tes­ti­mo­ni­als from the event, and I am over­all happy about that grant.

I do think that of the grants I recom­mended this round, this is prob­a­bly the one I feel least con­fi­dent about. I don’t know Or­pheus very well, and while I have re­ceived gen­er­ally pos­i­tive re­views of their work, I haven’t yet had the time to look into any of those re­views in de­tail, and haven’t seen clear ev­i­dence about the qual­ity of their judg­ment. How­ever, what I have seen seems pretty good, and if I had even a tiny bit more time to spend on eval­u­at­ing this round’s grants, I would prob­a­bly have spent it reach­ing out to Or­pheus and talk­ing with them more in per­son.

In gen­eral, I think time for self-study and re­flec­tion can be ex­cep­tion­ally im­por­tant for peo­ple start­ing to work in AI Align­ment. This is par­tic­u­larly true if they are fol­low­ing a more con­ven­tional aca­demic path which could eas­ily cause them to try to im­me­di­ately work on con­tem­po­rary AI ca­pa­bil­ities re­search, be­cause I gen­er­ally think this has nega­tive value even for peo­ple con­cerned about safety (though I do have some un­cer­tainty here). I think giv­ing peo­ple work­ing on more clas­si­cal ML re­search the time and re­sources to ex­plore the broader im­pli­ca­tions of their work on safety, if they are already in­ter­ested in that, is a good use of re­sources.

I am also ex­cited about build­ing out the Mon­treal AI Align­ment com­mu­nity, and hav­ing some­one who both has the time and skills to or­ga­nize events and can un­der­stand the tech­ni­cal safety work seems likely to have good effects.

This grant is also the small­est grant we are fund­ing this round, mak­ing me more com­fortable with a bit less due dili­gence than the other grants, es­pe­cially since this grant seems un­likely to have any large nega­tive con­se­quences.

Te­gan McCaslin ($30,000)

Con­duct­ing in­de­pen­dent re­search into AI fore­cast­ing and strat­egy questions

From the ap­pli­ca­tion:

1) I'd like to independently pursue research projects relevant to AI forecasting and strategy, including (but not necessarily limited to) some of the following:

[..]

I am ac­tively pur­su­ing op­por­tu­ni­ties to work with or un­der more se­nior AI strat­egy re­searchers [..], so my re­search fo­cus within AI strat­egy is likely to be in­fluenced by who ex­actly I end up work­ing with. Other­wise I ex­pect to spend some short pe­riod of time at the start gen­er­at­ing more re­search ideas and con­duct­ing pi­lot tests on the or­der of sev­eral hours into their tractabil­ity, then choos­ing which to pur­sue based on an im­por­tance/​tractabil­ity/​ne­glect­ed­ness frame­work.

[..]

2) There are rel­a­tively few re­searchers ded­i­cated full-time to in­ves­ti­gat­ing AI strat­egy ques­tions that are not im­me­di­ately policy-rele­vant. How­ever, there nonethe­less ex­ists room to con­tribute to the re­search on ex­is­ten­tial risks from AI with ap­proaches that fit into nei­ther tech­ni­cal AI safety nor AI policy/​gov­er­nance buck­ets.

My thoughts and reasoning

Te­gan has been a mem­ber of the X-risk net­work for sev­eral years now, and re­cently left AI Im­pacts. She is now look­ing for work as a re­searcher. Two con­sid­er­a­tions made me want to recom­mend that the LTF Fund make a grant to her.

  1. It’s eas­ier to re­lo­cate some­one who has already demon­strated trust and skills than to find some­one com­pletely new.

    1. This is (roughly) advice given by Y Combinator to startups, and I think it's relevant to the X-risk community. It's cheaper for Tegan to move around and find the place where she can do her best work than it would be for an outsider who has not already worked within the X-risk network: a similarly skilled individual who is not already part of the network would need to spend a few years understanding the community and demonstrating that they can be trusted. So I think it is a good idea to help Tegan explore other parts of the community to work in.

  2. It’s im­por­tant to give good re­searchers run­way while they find the right place.

    1. For many years, the X-risk com­mu­nity has been fund­ing-bot­tle­necked, keep­ing salaries low. A lot of progress has been made on this front and I hope that we’re able to fix this. Un­for­tu­nately, the cur­rent situ­a­tion means that when a hire does not work out, the in­di­vi­d­ual of­ten doesn’t have much run­way while re­ori­ent­ing, up­dat­ing on what didn’t work out, and sub­se­quently tri­al­ing at other or­ga­ni­za­tions.

    2. This moves them much more quickly into an emer­gency mode, where ev­ery­thing must be op­ti­mized for short-term in­come, rather than long-term up­dat­ing, skill build­ing, and re­search. As such, I think it is im­por­tant for Te­gan to have a com­fortable amount of run­way while do­ing solo re­search and tri­al­ling at var­i­ous or­ga­ni­za­tions in the com­mu­nity.

While I haven’t spent the time to look into Te­gan’s re­search in any depth, the small amount I did read looked promis­ing. The method­ol­ogy of this post is quite ex­cit­ing, and her work there and on other pieces seems very thor­ough and de­tailed.

That said, my brief as­sess­ment of Te­gan’s work was not the rea­son why I recom­mended this grant, and if Te­gan asks for a new grant in 6 months to fo­cus on solo re­search, I will want to spend sig­nifi­cantly more time read­ing her out­put and talk­ing with her, to un­der­stand how these ques­tions were cho­sen and what pre­cise re­la­tion they have to fore­cast­ing tech­nolog­i­cal progress in AI.

Over­all, I think Te­gan is in a good place to find a valuable role in our col­lec­tive X-risk re­duc­tion pro­ject, and I’d like her to have the run­way to find that role.

An­thony Aguirre ($70,000)

A ma­jor ex­pan­sion of the Me­tac­u­lus pre­dic­tion plat­form and its community

From the ap­pli­ca­tion:

The funds would be used to expand the Metaculus prediction platform along with its community. Metaculus.com is a fully-functional prediction platform with ~10,000 registered users and >120,000 predictions made to date on more than 1,000 questions. The goals of Metaculus are:

  • Short-term: Provide a re­source to sci­ence, tech, and (es­pe­cially) EA-re­lated com­mu­ni­ties already in­ter­ested in gen­er­at­ing, ag­gre­gat­ing, and em­ploy­ing ac­cu­rate pre­dic­tions, and train­ing to be bet­ter pre­dic­tors.

  • Medium-term: Im­prove de­ci­sion-mak­ing by in­di­vi­d­u­als and groups by pro­vid­ing well-cal­ibrated nu­mer­i­cal pre­dic­tions.

  • Long-term: en­courage and back­stop a wide­spread cul­ture of ac­countable and ac­cu­rate pre­dic­tions and sce­nario plan­ning.

There are two ma­jor high-pri­or­ity ex­pan­sions pos­si­ble with fund­ing in place. The first would be an in­te­grated set of ex­ten­sions to im­prove user in­ter­ac­tion and in­for­ma­tion-shar­ing. This would in­clude pri­vate mes­sag­ing and no­tifi­ca­tions, pri­vate groups, a pre­dic­tion “fol­low­ing” sys­tem to cre­ate micro-teams within in­di­vi­d­ual ques­tions, and var­i­ous in­cen­tives and sys­tems for in­for­ma­tion-shar­ing.

The second expansion would link questions into a network. Users would express links between questions, from very simple (“notify me regarding question Y when P(X) changes substantially”) to more complex (“Y happens only if X happens, but not conversely”, etc.). Information can also be gleaned from what users actually do. The strength and character of these relations can then generate different graphical models that can be explored interactively, with the ultimate goal of a crowd-sourced quantitative graphical model that could structure event relations and propagate new information through the network.

My thoughts and reasoning

For this grant, and also the grants to Ozzie Gooen and Ja­cob Lager­ros, I did not have enough time to write up my gen­eral thoughts on fore­cast­ing plat­forms and com­mu­ni­ties. I hope to later write a post with my thoughts here. But for a short sum­mary, see my thoughts on Ozzie Gooen’s grant.

I am gen­er­ally ex­cited about peo­ple build­ing plat­forms for co­or­di­nat­ing in­tel­lec­tual la­bor, par­tic­u­larly on top­ics that are highly rele­vant to the long-term fu­ture. I think Me­tac­u­lus has been pro­vid­ing a valuable ser­vice for the past few years, both in im­prov­ing our col­lec­tive abil­ity to fore­cast a large va­ri­ety of im­por­tant world events and in al­low­ing peo­ple to train and demon­strate their fore­cast­ing skills, which I ex­pect to be­come more rele­vant in the fu­ture.

I am broadly impressed with how cooperative and responsive the Metaculus team has been in helping organizations in the X-risk space get answers to important questions, and in providing software services to them (e.g. I know that they are helping Jacob Lagerros and Ben Goldhaber set up a private Metaculus instance focused on AI).

I don’t know An­thony well, and over­all I am quite con­cerned that there is no full-time per­son on this pro­ject. My model is that pro­jects like this tend to go a lot bet­ter if they have one core cham­pion who has the re­sources to fully ded­i­cate them­selves to the pro­ject, and it cur­rently doesn’t seem that An­thony is able to do that.

My cur­rent model is that Me­tac­u­lus will strug­gle as a plat­form with­out a fully ded­i­cated team or at least in­di­vi­d­ual cham­pion, though I have not done a thor­ough in­ves­ti­ga­tion of the Me­tac­u­lus team and pro­ject, so I am not very con­fi­dent of this. One of the ma­jor mo­ti­va­tions for this grant is to en­sure that Me­tac­u­lus has enough re­sources to hire a po­ten­tial new cham­pion for the pro­ject (who ideally also has pro­gram­ming skills or UI de­sign skills to al­low them to di­rectly work on the plat­form). That said, Me­tac­u­lus should use the money as best they see fit.

I am also concerned about the overlap between Metaculus and the Good Judgment Project, and currently have a sense that Metaculus suffers from being in competition with it, while also having access to substantially fewer resources and people.

The re­quested grant amount was for $150k, but I am cur­rently not con­fi­dent enough in this grant to recom­mend filling the whole amount. If Me­tac­u­lus finds an in­di­vi­d­ual new cham­pion for the pro­ject, I can imag­ine strongly recom­mend­ing that it gets fully funded, if the new cham­pion seems com­pe­tent.

Lau­ren Lee ($20,000)

Work­ing to pre­vent burnout and boost pro­duc­tivity within the EA and X-risk communities

From the ap­pli­ca­tion:

(1) After 2 years as a CFAR in­struc­tor/​re­searcher, I’m cur­rently in a 6-12 month phase of re­ori­ent­ing around my goals and plans. I’m re­quest­ing a grant to spend the com­ing year think­ing about ra­tio­nal­ity and test­ing new pro­jects.

(2) I want to help in­di­vi­d­u­als and orgs in the x-risk com­mu­nity ori­ent to­wards and achieve their goals.

(A) I want to train the skill of de­pend­abil­ity, in my­self and oth­ers.

This is the skill of a) fol­low­ing through on com­mit­ments and b) mak­ing proso­cial /​ difficult choices in the face of fear and aver­sion. The skill of do­ing the cor­rect thing, de­spite go­ing against in­cen­tive gra­di­ents, seems to be the key to virtue.

One strat­egy I’ve used is to sur­round my­self with peo­ple with shared val­ues (CFAR, Bay Area) and trust the re­sult­ing in­cen­tive gra­di­ents. I now be­lieve it is also crit­i­cal to be the kind of per­son who can take cor­rect ac­tion de­spite pre­vailing in­cen­tive struc­tures.

Depend­abil­ity is also re­lated to think­ing clearly. Your abil­ity to make the right de­ci­sion de­pends on your abil­ity to hold and be with all pos­si­ble re­al­ities, es­pe­cially painful and aver­sive ones. Most peo­ple have blindspots that ac­tively pre­vent this.

I have some leads on how to train this skill, and I’d like both time and money to test them.

(B) Think­ing clearly about AI risk

Most peo­ple’s de­ci­sions in the Bay Area AI risk com­mu­nity seem model-free. They them­selves don’t have mod­els of why they’re do­ing what they’re do­ing; they’re rely­ing on other peo­ple “with mod­els” to tell them what to do and why. I’ve per­son­ally car­ried around such premises. I want to help peo­ple ex­plore where their ‘place­holder premises’ are and cre­ate safety for look­ing at their true mo­ti­va­tions, and then help them be­come more in­ter­nally and ex­ter­nally al­igned.

(C) Burnout

Speaking of “not getting very far.” My personal opinion is that most ex-CFAR employees left because of burnout; I've written what I've learned here, see top 2 comments: [https://forum.effectivealtruism.org/posts/NDszJWMsdLCB4MNoy/burnout-what-is-it-and-how-to-treat-it#87ue5WzwaFDbGpcA7]. I'm interested in working with orgs and individuals to prevent burnout proactively.

(3) Some pos­si­ble mea­surable out­puts /​ ar­ti­facts:

  • A pro­gram where I do 1-on-1 ses­sions with in­di­vi­d­u­als or orgs; I’d cre­ate re­ports based on whether they self-re­port improvements

  • X-risk orgs (e.g. FHI, MIRI, OpenPhil, BERI, etc.) de­cid­ing to spend time/​money on my ser­vices may be a pos­i­tive in­di­ca­tor, as they tend to be thought­ful with how they spend their resources

  • Writ­ings or talks

  • Work­shops with feed­back forms

  • A more effec­tive ver­sion of my­self (no­table changes = gain­ing the abil­ity to ride a bike /​ drive a car /​ ex­er­cise—a PTSD-re­lated dis­abil­ity, abil­ity to finish pro­jects to com­ple­tion, oth­ers notic­ing stark changes in me)

My thoughts and reasoning

Lau­ren worked as an in­struc­tor at CFAR for about 2 years, un­til Fall 2018. I re­view CFAR’s im­pact as an in­sti­tu­tion be­low; in gen­eral, I be­lieve it has helped set a strong epistemic foun­da­tion for the com­mu­nity and been suc­cess­ful in re­cruit­ment and train­ing. I have a great ap­pre­ci­a­tion for ev­ery­one who helps them with their work.

Lauren is currently in a period of reflection and reorientation around her life and the problem of AGI, in part due to experiencing burnout in the months before she left CFAR. To my knowledge, CFAR has never been well-funded enough to offer high salaries to its employees, and I think it is valuable to ensure that people who work at EA orgs and burn out have the support to take the time for self-care after quitting due to long-term stress. Ideally, I think this should be improved by higher salaries that allow employees to build significant runway to deal with shocks like this, but the current equilibrium of salary levels in EA does not make that easy. Overall, I think it's likely that staff at highly valuable EA orgs will continue burning out, and I don't currently see preventing this entirely as an achievable target (though I am in favor of people working on solving the problem).

I do not know Lau­ren well enough to eval­u­ate the qual­ity of her work on the art of hu­man ra­tio­nal­ity, but mul­ti­ple peo­ple I trust have given pos­i­tive re­views (e.g. see Alex Zhu above), so I am also in­ter­ested to read her out­put on the sub­jects she is think­ing about.

I think it’s very im­por­tant that peo­ple who work on de­vel­op­ing an un­der­stand­ing of hu­man ra­tio­nal­ity take the time to add their knowl­edge into our col­lec­tive un­der­stand­ing, so that oth­ers can benefit from and build on top of it. Lau­ren has be­gun to write up her thoughts on top­ics like burnout, in­ten­tions, de­pend­abil­ity, cir­cling, and cu­ri­os­ity, and her hav­ing the space to con­tinue to write up her ideas seemed like a sig­nifi­cant ad­di­tional pos­i­tive out­come of this grant.

I think that she should probably aim to make whatever she does valuable enough that individuals and organizations in the community wish to pay her directly for her work. It's unlikely that I would recommend renewing this grant for another 6-month period in the absence of a relatively exciting new research project or direction, and if Lauren were to reapply, I would want a much stronger sense that the projects she was working on were producing lots of value before deciding to recommend funding her again.

In sum, this grant hope­fully helps Lau­ren to re­cover from burn­ing out, get the new ra­tio­nal­ity pro­jects she is work­ing on off the ground, po­ten­tially iden­tify a good new niche for her to work in (alone or at an ex­ist­ing or­ga­ni­za­tion), and write up her ideas for the com­mu­nity.

Ozzie Gooen ($70,000)

Building infrastructure for the future of effective forecasting efforts

From the ap­pli­ca­tion:

What I will do

I ap­plied a few months ago and was granted $20,000 (thanks!). My pur­pose for this money is similar but greater in scope to the pre­vi­ous round. The pre­vi­ous fund­ing has given me the se­cu­rity to be more am­bi­tious, but I’ve re­al­ized that ad­di­tional guaran­tees of fund­ing should help sig­nifi­cantly more. In par­tic­u­lar, en­g­ineers can be costly and it would be use­ful to se­cure ad­di­tional fund­ing in or­der to give pos­si­ble hires se­cu­rity.

My main over­all goal is to ad­vance the use of pre­dic­tive rea­son­ing sys­tems for pur­poses most use­ful for Effec­tive Altru­ism. I think this is an area that could even­tu­ally make use of a good deal of tal­ent, so I have come to see my work at this point as foun­da­tional.

This work is in a few differ­ent ar­eas that I think could be valuable. I ex­pect that af­ter a while a few parts will emerge as the most im­por­tant, but think it is good to ex­per­i­ment early when the most effec­tive route is not yet clear.

I plan to use ad­di­tional funds to scale my gen­eral re­search and de­vel­op­ment efforts. I ex­pect that most of the money will be used on pro­gram­ming efforts.

Foretold

Fore­told is a fore­cast­ing ap­pli­ca­tion that han­dles full prob­a­bil­ity dis­tri­bu­tions. I have be­gun test­ing it with users and have been asked for quite a bit more func­tion­al­ity. I’ve also mapped out the fea­tures that I ex­pect peo­ple will even­tu­ally de­sire, and think there is a sig­nifi­cant amount of work that would be sig­nifi­cantly use­ful.

One par­tic­u­lar challenge is figur­ing out the best way to han­dle large num­bers of ques­tions (1000 ac­tive ques­tions plus, at a time.) I be­lieve this re­quires sig­nifi­cant in­no­va­tions in the user in­ter­face and back­end ar­chi­tec­ture. I’ve made some wire­frames and have ex­per­i­mented with differ­ent meth­ods, and be­lieve I have a prag­matic path for­ward, but will need to con­tinue to iter­ate.

I’ve talked with mem­bers of mul­ti­ple or­ga­ni­za­tions at this point who would like to use Fore­told once it has a spe­cific set of fea­tures, and can­not cur­rently use any ex­ist­ing sys­tem for their pur­poses. […]

Ken

Ken is a pro­ject to help or­ga­ni­za­tions set up and work with struc­tured data, in essence al­low­ing them to have pri­vate ver­sions of Wik­i­data. Part of the pro­ject is Ken.js, a library which I’m be­gin­ning to in­te­grate with Fore­told.

Ex­pected Impact

The main aim of EA fore­cast­ing would be to bet­ter pri­ori­tize EA ac­tions. I think that if we could have a pow­er­ful sys­tem set up, it could make us bet­ter at pre­dict­ing the fu­ture, bet­ter at un­der­stand­ing what things are im­por­tant and bet­ter at com­ing to a con­sen­sus on challeng­ing top­ics.

Measurement

In the short term, I’m us­ing heuris­tics like met­rics re­gard­ing user ac­tivity and up­votes on LessWrong. I’m also get­ting feed­back by many peo­ple in the EA re­search com­mu­nity. In the medium to long term, I hope to set up eval­u­a­tion/​es­ti­ma­tion pro­ce­dures for many pro­jects and would in­clude this one in that pro­cess.

My thoughts and reasoning

This grant is to support Ozzie Gooen in his efforts to build infrastructure for effective forecasting. Ozzie requested $70,000 to hire a software engineer to support his work on the prediction platform he is building, www.foretold.io.

  • When think­ing about how to im­prove the long-term fu­ture, I think we are con­fused about what counts as progress and what spe­cific prob­lems need solv­ing. We can already see that there are a lot of tech­ni­cal and con­cep­tual prob­lems that have to be solved to make progress on a lot of the big prob­lems we think are im­por­tant.

  • I think that in or­der to make effec­tive in­tel­lec­tual progress, you need some way for many peo­ple to col­lab­o­rate on solv­ing prob­lems and to doc­u­ment the progress they have made so far.

  • I think there is po­ten­tially a lot of low-hang­ing fruit in de­sign­ing bet­ter on­line plat­forms for mak­ing in­tel­lec­tual progress (which is why I chose to work on LessWrong + AI Align­ment Fo­rum + EA Fo­rum). Ozzie works in this space too, and pre­vi­ously built Guessti­mate (a spread­sheet where ev­ery cell is a prob­a­bil­ity dis­tri­bu­tion), which I think dis­played some real in­no­va­tion in the way we can use tech­nol­ogy to com­mu­ni­cate and clar­ify ideas. It was also pro­duced to a very high stan­dard of qual­ity.

  • Forecasting platforms in particular have already displayed significant promise and tractability, with recent work by Philip Tetlock showing that a simple prediction platform can outperform major governmental institutions like the CIA, and older work by Robin Hanson showing ways that prediction markets could help us make progress on a number of interesting problems.

  • The biggest concern I have with Ozzie's work, as well as with work on other prediction and aggregation platforms, is that the problem of getting people to actually use the product turns out to be very hard. Matt Fallshaw's team at Trike Apps built https://predictionbook.com/, but then found it hard to get people to actually use it. Ozzie's last project, Guesstimate, seemed quite well-executed, but similarly faltered due to low user numbers and a lack of interest from potential customers in industry. As such, I think it's important not to underestimate the difficulty of making the product good enough that people actually use it.

  • I do think that the road to building knowledge aggregation platforms will include many failed projects and many experiments that never get traction; as such, one should not over-update on the lack of users for some of the existing platforms. As a positive counterexample, the Good Judgment Project seems to have a consistently high number of people making predictions.

  • I’ve also fre­quently in­ter­acted with Ozzie in per­son, and gen­er­ally found his rea­son­ing and judg­ment in this do­main to be good. I also think it is quite good that he has been writ­ing up his think­ing for the com­mu­nity to read and en­gage with, which will al­low other peo­ple to build off of his think­ing and efforts, even if he doesn’t find trac­tion with this par­tic­u­lar pro­ject.

Jo­hannes Hei­decke ($25,000)

Sup­port­ing as­piring re­searchers of AI al­ign­ment to boost them­selves into productivity

From the ap­pli­ca­tion:

(1) We would like to apply for a grant to fund an upcoming camp in Madrid that we are organizing. The camp consists of several weeks of online collaboration on concrete research questions, culminating in a 9-day intensive in-person research camp. Participants will work in groups on tightly-defined research projects in strategy and technical AI safety. Expert advisors from AI Safety/Strategy organizations will help refine proposals to be tractable and relevant. This allows for time-efficient use of advisors' knowledge and research experience, and ensures that research is well-aligned with current priorities. More information: https://aisafetycamp.com/

(2) The field of AI al­ign­ment is tal­ent-con­strained, and while there is a sig­nifi­cant num­ber of young as­piring re­searchers who con­sider fo­cussing their ca­reer on re­search on this topic, it is of­ten very difficult for them to take the first steps and be­come pro­duc­tive with con­crete and rele­vant pro­jects. This is par­tially due to es­tab­lished re­searchers be­ing time-con­strained and not hav­ing time to su­per­vise a large num­ber of stu­dents. The goals of AISC are to help a rel­a­tively large num­ber of high-tal­ent peo­ple to take their first con­crete steps in re­search on AI safety, con­nect them to col­lab­o­rate, and effi­ciently use the ca­pac­i­ties of ex­pe­rienced re­searchers to guide them on their path.

(3) We send out eval­u­a­tion ques­tion­naires di­rectly af­ter the camp and in reg­u­lar in­ter­vals af­ter the camp has passed. We mea­sure im­pact on ca­reer de­ci­sions and col­lab­o­ra­tions and keep track of con­crete out­put pro­duced by the teams, such as blog posts or pub­lished ar­ti­cles.

We have suc­cess­fully or­ga­nized two camps be­fore and are in the prepa­ra­tion phase for the third camp tak­ing place in April 2019 near Madrid. I was the main or­ga­nizer for the sec­ond camp and am ad­vis­ing the core team of the cur­rent camp, as well as or­ga­niz­ing fund­ing.

An overview of pre­vi­ous re­search pro­jects from the first 2 camps can be found here:

https://aisafetycamp.com/2018/06/05/aisc-1-research-summaries/

https://aisafetycamp.com/2018/12/07/aisc2-research-summaries/

We have eval­u­ated the feed­back from par­ti­ci­pants of the first two camps in the fol­low­ing two doc­u­ments:

https://docs.google.com/document/d/1f8wvsvQTv4wdBaggCaK8aKC5gFdIHUDcihnmVkZPM6I/edit?usp=sharing

https://docs.google.com/document/d/18v2e-S3iZrOPbE7d9n26sUs1K6CkUAvezRvRj_xlcj8/edit?usp=sharing

My thoughts and reasoning

I’ve talked with var­i­ous par­ti­ci­pants of past AI Safety camps and heard broadly good things across the board. I also gen­er­ally have a pos­i­tive im­pres­sion of the peo­ple in­volved, though I don’t know any of the or­ga­niz­ers very well.

The ma­te­rial and tes­ti­mo­ni­als that I’ve seen so far sug­gest that the camp suc­cess­fully points par­ti­ci­pants to­wards a tech­ni­cal ap­proach to AI Align­ment, fo­cus­ing on rigor­ous rea­son­ing and clear ex­pla­na­tions, which seems good to me.

I am not re­ally sure whether I’ve ob­served sig­nifi­cant pos­i­tive out­comes of camps in past years, though this might just be be­cause I am less con­nected to the Euro­pean com­mu­nity these days.

I also have a sense that there is a lack of opportunities for people in Europe to productively work on AI Alignment-related problems, and so I am particularly interested in investing in infrastructure and events there. This does, however, make this a higher-risk grant: the event and the people surrounding it might become the main locus of AI Alignment work in Europe, and if their quality isn't high enough, this might cause long-term problems for the AI Alignment community in Europe.

Concerns

  • I think or­ga­niz­ing long in-per­son events is hard, and con­flict can eas­ily have out­sized nega­tive effects. The re­views that I read from past years sug­gest that in­ter­per­sonal con­flict nega­tively af­fected many par­ti­ci­pants. Learn­ing how to deal with con­flict like this is difficult. The or­ga­niz­ers seem to have con­sid­ered this and thought a lot about it, but the most likely way I ex­pect this grant to have large nega­tive con­se­quences is still if there is some kind of con­flict at the camp that re­sults in more se­ri­ous prob­lems.

  • I think it’s in­evitable that some peo­ple won’t get along with or­ga­niz­ers or other par­ti­ci­pants at the camp for cul­tural rea­sons. If that hap­pens, I think it’s im­por­tant for these peo­ple to have some other way of get­ting con­nected to peo­ple work­ing on AI Align­ment. I don’t know the best way to ar­range this, but I would want the or­ga­niz­ers to think about ways to achieve it.

I also co­or­di­nated with Ni­cole Ross from CEA’s EA Grants pro­ject, who had con­sid­ered also mak­ing a grant to the camp. We de­cided it would be bet­ter for the LTF Fund team to make this grant, though we wanted to make sure that some of the con­cerns Ni­cole had with this grant were sum­ma­rized in our an­nounce­ment:

  • AISC could po­ten­tially turn away peo­ple who would be very good for AI Safety or EA, if those peo­ple have nega­tive in­ter­ac­tions at the camp or if they are much more tal­ented than other par­ti­ci­pants (and there­fore de­velop a low opinion of AI Safety and/​or EA).

  • Some negative interactions with people at the camp could, as with all residential programs, lead to harm and/or PR issues (for example, if someone at the camp were sexually harassed). Being able to handle such issues thoughtfully and carefully is a hard task, and additional support or advice may be beneficial.

This seems to roughly mir­ror my con­cerns above.

I would want to en­gage with the or­ga­niz­ers a fair bit more be­fore recom­mend­ing a re­newal of this grant, but I am happy about the pro­ject as a space for Euro­peans to get en­gaged with al­ign­ment ideas and work on them for a week to­gether with other tech­ni­cal and en­gaged peo­ple.

Broadly, the effects of the camp seem very likely to be pos­i­tive, while the (fi­nan­cial) cost of the camp seems small com­pared to the ex­pected size of the im­pact. This makes me rel­a­tively con­fi­dent that this grant is a good bet.

Vy­ach­es­lav Matyuhin ($50,000)

An offline com­mu­nity hub for ra­tio­nal­ists and EAs

From the ap­pli­ca­tion:

Our team is work­ing on the offline com­mu­nity hub for ra­tio­nal­ists and EAs in Moscow called Kocherga (de­tails on Kocherga are here).

We want to make sure it keeps ex­ist­ing and grows into the work­ing model for build­ing new flour­ish­ing lo­cal EA com­mu­ni­ties around the globe.

Our key as­sump­tions are:

  1. There’s a gap be­tween the “monthly meetup” EA com­mu­ni­ties and the larger (and sig­nifi­cantly more pro­duc­tive/​im­por­tant) com­mu­ni­ties. That gap is hard to close for many rea­sons.

  2. Solv­ing this is­sue sys­tem­at­i­cally would add a lot of value to the global EA move­ment and, as a con­se­quence, the long-term fu­ture of hu­man­ity.

  3. Clos­ing the gap re­quires a lot of in­fras­truc­ture, both or­ga­ni­za­tional and tech­nolog­i­cal.

So we work on build­ing such an in­fras­truc­ture. We also keep in mind the al­ign­ment and good­hart­ing is­sues (build­ing a big com­mu­nity of peo­ple who call them­selves EAs but who don’t ac­tu­ally share EA virtues would be bad, ob­vi­ously).

[…]

Con­cretely, we want to:

  1. Add 2 more peo­ple to our team.

  2. Im­ple­ment our new com­mu­nity build­ing strat­egy (which in­cludes both or­ga­ni­za­tional tasks such as new events and pro­cesses for seed­ing new work­ing groups, and tech­nolog­i­cal tasks such as im­ple­ment­ing a web­site which al­lows peo­ple from the com­mu­nity to an­nounce new pri­vate mee­tups or team up for coach­ing or mas­ter­mind groups)

  3. Im­prove our ra­tio­nal­ity work­shops (in terms of scale and con­tent qual­ity). Work­shops are im­por­tant for at­tract­ing new com­mu­nity mem­bers, for keep­ing the high epistemic stan­dards of the com­mu­nity and for mak­ing sure that com­mu­nity mem­bers can be as pro­duc­tive as pos­si­ble.

To be able to do this, we need to cover our cur­rent ex­penses some­how un­til we be­come prof­itable on our own.

My thoughts and reasoning

The Rus­sian ra­tio­nal­ity com­mu­nity is sur­pris­ingly big, which sug­gests both a cer­tain level of com­pe­tence from some of its core or­ga­niz­ers and po­ten­tial op­por­tu­ni­ties for more com­mu­nity build­ing. The com­mu­nity has:

  • Suc­cess­fully trans­lated The Se­quences and HPMOR into Rus­sian, as can be seen at the helpful LessWrong.ru site.

  • Ex­e­cuted a suc­cess­ful kick­starter cam­paign to dis­tribute phys­i­cal copies of HPMOR (over 7,000 copies).

  • Built a com­mu­nity hub in Moscow called Kocherga, which is a fi­nan­cially self-sus­tain­ing anti-cafe (a cafe where you pay for time spent there rather than drinks/​snacks) that hosts a va­ri­ety of ra­tio­nal­ity events for roughly 100 at­ten­dees per week.

This grant is to the team that runs the Kocherga anti-cafe.

Their LessWrong write-up sug­gests:

  • They have good skills at build­ing spaces, run­ning events, and gen­er­ally pre­serv­ing their cul­ture while still be­ing fi­nan­cially sustainable

  • They’ve seen steady in­creases over time in available fund­ing and attendees

  • They’ve suc­ceeded at be­ing largely self-suffi­cient for 4 years

  • They’ve suc­cess­fully en­gaged with other lo­cal in­tel­lec­tual communities

  • Their cul­ture seems to value care­ful think­ing and good dis­course a lot, and they seem to have put se­ri­ous effort into de­vel­op­ing the art of ra­tio­nal­ity, in­clud­ing car­ing about the tech­ni­cal as­pects and in­cor­po­rat­ing CFAR’s work into their thinking

I find myself having slightly conflicted feelings about the Russian rationality community trying to identify and integrate more with the EA community. A major predictor of how excited I have historically been about community building efforts is the degree to which a group emphasizes improving its members' judgment and thinking skills and holds itself to high epistemic standards and careful thinking. I am quite excited about how Kocherga seems to have focused on those issues so far, and I am worried that this integration and change of identity will reduce that focus (as I think it has for some local and student groups that made a similar transition). That said, I think the Kocherga group has shown quite good judgment on this dimension (see here), which addresses many of my concerns, though I am still interested in thinking and talking about these issues further.

I’m some­what con­cerned that I’m not aware of any ma­jor in­sights or un­usu­ally tal­ented peo­ple from this com­mu­nity, but I ex­pect the lan­guage bar­rier to be a big part of what is pre­vent­ing me from hear­ing about those things. And I am some­what con­fused about how to ac­count for in­ter­est­ing ideas that don’t spread to the pro­jects I care most about.

I think there are benefits to having an active Russian community that can take opportunities that are only available to people in Russia, or at least to people who speak Russian. This particularly applies to policy-oriented work on AI alignment and other global catastrophic risks, which is also a domain that I feel confused about and have a hard time evaluating.

For a lot of the work that I do feel com­fortable eval­u­at­ing, I ex­pect the vast ma­jor­ity of in­tel­lec­tual progress to be made in the English-speak­ing world, and as such, the ques­tion of how tal­ent can flow from Rus­sia to the ex­ist­ing com­mu­ni­ties work­ing on the long-term fu­ture seems quite im­por­tant. I hope this grant can fa­cil­i­tate a stronger con­nec­tion be­tween the rest of the world and the Rus­sian com­mu­nity, to im­prove that tal­ent and idea flow.

This grant seemed like a slightly better fit for the EA Meta Fund. They decided not to fund it, so we made the grant instead, since it still seemed like a strong proposal to us.

What I have seen so far makes me con­fi­dent that this grant is a good idea. How­ever, be­fore we make more grants like this, I would want to talk more to the or­ga­niz­ers in­volved and gen­er­ally get more in­for­ma­tion on the struc­ture and cul­ture of the Rus­sian EA and ra­tio­nal­ity com­mu­ni­ties.

Ja­cob Lager­ros ($27,000)

Building infrastructure to give X-risk researchers superforecasting ability with minimal overhead

From the ap­pli­ca­tion:

Build a pri­vate plat­form where AI safety and policy re­searchers have di­rect ac­cess to a base of su­perfore­caster-equiv­a­lents, and where as­piring EAs with smaller op­por­tu­nity costs but ex­cel­lent cal­ibra­tion perform use­ful work.

[…]

I previously received two grants to work on this project: a half-time salary from EA Grants, and a grant for direct project expenses from BERI. Since then, I dropped out of a Master's programme to work full-time on this, seeing that was the only way I could really succeed at building something great. However, during that transition there were some logistical issues with other grantmakers (explained in more detail in the application), hence I applied to the LTF for funding for food, board, travel and the runway to make more risk-neutral decisions and capture unexpected opportunities in the coming ~12 months of working on this.

My thoughts and reasoning

There were three main fac­tors be­hind my recom­mend­ing this grant:

  1. My ob­ject-level rea­sons for recom­mend­ing this grant are quite similar to my rea­sons for recom­mend­ing Ozzie Gooen’s and An­thony Aguirre’s.

  2. Ja­cob has been around the com­mu­nity for about 3 years. The out­put of his that I’ve seen has in­cluded (amongst other things) com­pe­tently co-di­rect­ing EAGxOxford 2016, and some thought­ful es­says on LessWrong (e.g. 1, 2, 3, 4).

  3. Ja­cob’s work seems use­ful to me, and is be­ing funded on the recom­men­da­tion of the FHI Re­search Schol­ars Pro­gramme and the Berkeley Ex­is­ten­tial Risk Ini­ti­a­tive. He is also col­lab­o­rat­ing with oth­ers I’m ex­cited about (Me­tac­u­lus and Ozzie Gooen).

However, I did not assess the grant in detail, as Jacob only asked for a grant because of logistical complications with other grantmakers. Since FHI and BERI have already investigated the project in more detail, I was happy to suggest we pick up the slack to ensure Jacob has the runway to pursue his work.

Con­nor Flex­man ($20,000)

Performing independent research in collaboration with John Salvatier

I am recom­mend­ing this grant with more hes­i­ta­tion than most of the other grants in this round. The rea­sons for hes­i­ta­tion are as fol­lows:

  • I was the pri­mary per­son on the grant com­mit­tee on whose recom­men­da­tion this grant was made.

  • Con­nor lives in the same group house that I live in, which I think adds a com­pli­cat­ing con­flict of in­ter­est to my recom­men­da­tion.

  • I have gen­er­ally pos­i­tive im­pres­sions of Con­nor, but I have not per­son­ally seen con­crete, ex­ter­nally ver­ifi­able ev­i­dence that clearly demon­strates his good judg­ment and com­pe­tence, which in com­bi­na­tion with the other two fac­tors makes me more hes­i­tant than usual.

How­ever, de­spite these reser­va­tions, I think this grant is a good choice. The two pri­mary rea­sons are:

  1. Connor himself has worked on a variety of research and community building projects, and both by my own assessment and that of other people I talked to, he has significant potential to become a strong generalist researcher, which I think is an axis on which a lot of important projects are bottlenecked.

  2. This grant was strongly recom­mended to me by John Sal­vatier, who is funded by an EA Grant and whose work I am gen­er­ally ex­cited about.

John did some very valuable com­mu­nity or­ga­niz­ing while he lived in Seat­tle and is now work­ing on de­vel­op­ing tech­niques to fa­cil­i­tate skill trans­fer be­tween ex­perts in differ­ent do­mains. I think it is ex­cep­tion­ally hard to de­velop effec­tive tech­niques for skill trans­fer, and more broadly tech­niques to im­prove peo­ple’s ra­tio­nal­ity and rea­son­ing skills, but am suffi­ciently im­pressed with John’s think­ing that I think he might be able to do it any­way (though I still have some reser­va­tions).

John is cur­rently col­lab­o­rat­ing with Con­nor and re­quested fund­ing to hire him to col­lab­o­rate on his pro­jects. After talk­ing to Con­nor I de­cided it would be bet­ter to recom­mend a grant to Con­nor di­rectly, en­courag­ing him to con­tinue work­ing with John but also al­low­ing him to switch to­wards other re­search pro­jects if he finds he can’t con­tribute as pro­duc­tively to John’s re­search as he ex­pects.

Over­all, while I feel some hes­i­ta­tion about this grant, I think it’s very un­likely to have any sig­nifi­cant nega­tive con­se­quences, and I as­sign some sig­nifi­cant prob­a­bil­ity that this grant can help Con­nor de­velop into an ex­cel­lent gen­er­al­ist re­searcher of a type that I feel like EA is cur­rently quite bot­tle­necked on.

Eli Tyre ($30,000)

Broad pro­ject sup­port for ra­tio­nal­ity and com­mu­nity build­ing interventions

Eli has worked on a large va­ri­ety of in­ter­est­ing and valuable pro­jects over the last few years, many of them too small to have much pay­ment in­fras­truc­ture, re­sult­ing in him do­ing a lot of work with­out ap­pro­pri­ate com­pen­sa­tion. I think his work has been a prime ex­am­ple of pick­ing low-hang­ing fruit by us­ing lo­cal in­for­ma­tion and solv­ing prob­lems that aren’t worth solv­ing at scale, and I want him to have re­sources to con­tinue work­ing in this space.

Con­crete ex­am­ples of pro­jects he has worked on that I am ex­cited about:

  • Fa­cil­i­tat­ing con­ver­sa­tions be­tween top peo­ple in AI al­ign­ment (I’ve in par­tic­u­lar heard very good things about the 3-day con­ver­sa­tion be­tween Eric Drexler and Scott Garrabrant that Eli helped fa­cil­i­tate)

  • Or­ga­niz­ing ad­vanced work­shops on Dou­ble Crux and other key ra­tio­nal­ity techniques

  • Do­ing a va­ri­ety of small in­de­pen­dent re­search pro­jects, like this eval­u­a­tion of birth or­der effects in mathematicians

  • Pro­vid­ing many new EAs and ra­tio­nal­ists with ad­vice and guidance on how to get trac­tion on work­ing on im­por­tant problems

  • Helping John Sal­vatier de­velop tech­niques around skill transfer

I think Eli has ex­cep­tional judg­ment, and the goal of this grant is to al­low him to take ac­tions with greater lev­er­age by hiring con­trac­tors, pay­ing other com­mu­nity mem­bers for ser­vices, and pay­ing for other varied ex­penses as­so­ci­ated with his pro­jects.

Robert Miles ($39,000)

Pro­duc­ing video con­tent on AI alignment

From the ap­pli­ca­tion:

My goals are:

  1. To com­mu­ni­cate to in­tel­li­gent and tech­ni­cally-minded young peo­ple that AI Safety:

    1. is full of hard, open, tech­ni­cal prob­lems which are fas­ci­nat­ing to think about

    2. is a real ex­ist­ing field of re­search, not scifi speculation

    3. is a grow­ing field, which is hiring

  2. To help oth­ers in the field com­mu­ni­cate and ad­vo­cate bet­ter, by pro­vid­ing high qual­ity, ap­proach­able ex­pla­na­tions of AIS con­cepts that peo­ple can share, in­stead of ex­plain­ing the ideas them­selves, or shar­ing tech­ni­cal doc­u­ments that peo­ple won’t read

  3. To mo­ti­vate my­self to read and in­ter­nal­ise the pa­pers and text­books, and be­come a tech­ni­cal AIS re­searcher in future

My thoughts and reasoning

I think video is a valuable medium for ex­plain­ing a va­ri­ety of differ­ent con­cepts (for the best ex­am­ples of this, see 3Blue1Brown, CGP Grey, and Khan Academy). While there are a lot of peo­ple work­ing di­rectly on im­prov­ing the long term fu­ture by writ­ing ex­plana­tory con­tent, Rob is the only per­son I know who has in­vested sig­nifi­cantly in get­ting bet­ter at pro­duc­ing video con­tent. I think this opens a unique set of op­por­tu­ni­ties for him.

The videos on his YouTube channel pick up an average of ~20k views. His videos on the official Computerphile channel often pick up more than 100k views, including for topics like logical uncertainty and corrigibility (incidentally, a term Rob came up with).

More things that make me op­ti­mistic about Rob’s broad ap­proach:

  • He ex­plains that AI al­ign­ment is a tech­ni­cal prob­lem. AI safety is not pri­mar­ily a moral or poli­ti­cal po­si­tion; the biggest chunk of the prob­lem is a mat­ter of com­puter sci­ence. Reach­ing out to a tech­ni­cal au­di­ence to ex­plain that AI safety is a tech­ni­cal prob­lem, and thus di­rectly re­lated to their pro­fes­sion, is a type of ‘out­reach’ that I’m very happy to en­dorse.

  • He does not make AI safety a poli­ti­cized mat­ter. I am very happy that Rob is not need­lessly trib­al­is­ing his con­tent, e.g. by talk­ing about some­thing like “good vs bad ML re­searchers”. He seems to sim­ply por­tray it as a set of in­ter­est­ing and im­por­tant tech­ni­cal prob­lems in the de­vel­op­ment of AGI.

  • His goal is to cre­ate in­ter­est in these prob­lems from fu­ture re­searchers, and not to sim­ply get as large of an au­di­ence as pos­si­ble. As such, Rob’s ex­pla­na­tions don’t op­ti­mize for views at the ex­pense of qual­ity ex­pla­na­tion. His videos are clearly de­signed to be en­gag­ing, but his ex­pla­na­tions are sim­ple and ac­cu­rate. Rob of­ten in­ter­acts with re­searchers in the com­mu­nity (at places like Deep­Mind and MIRI) to dis­cuss which con­cepts are in need of bet­ter ex­pla­na­tions. I don’t ex­pect Rob to take unilat­eral ac­tion in this do­main.

Rob is the first skil­led per­son in the X-risk com­mu­nity work­ing full-time on pro­duc­ing video con­tent. Be­ing the very best we have in this skill area, he is able to help the com­mu­nity in a num­ber of novel ways (for ex­am­ple, he’s already helping ex­ist­ing or­ga­ni­za­tions pro­duce videos about their ideas).

Rob made a grant re­quest dur­ing the last round, in which he ex­plic­itly re­quested fund­ing for a col­lab­o­ra­tion with RAISE to pro­duce videos for them. I cur­rently don’t think that work­ing with RAISE is the best use of Rob’s tal­ent, and I’m skep­ti­cal of the product RAISE is cur­rently try­ing to de­velop. I think it’s a bet­ter idea for Rob to fo­cus his efforts on pro­duc­ing his own videos and sup­port­ing other or­ga­ni­za­tions with his skills, though this grant doesn’t re­strict him to work­ing with any par­tic­u­lar or­ga­ni­za­tion and I want him to feel free to con­tinue work­ing on RAISE if that is the pro­ject he thinks is cur­rently most valuable.

Over­all, Rob is de­vel­op­ing a new and valuable skill within the X-risk com­mu­nity, and ex­e­cut­ing on it in a very com­pe­tent and thought­ful way, mak­ing me pretty con­fi­dent that this grant is a good idea.

MIRI ($50,000)

My thoughts and reasoning

  • MIRI is a 20-year-old re­search or­ga­ni­za­tion that seeks to re­solve the core difficul­ties in the way of AGI hav­ing a pos­i­tive im­pact.

    • My model of MIRI’s ap­proach looks some­thing like an at­tempt to join the ranks of Tur­ing, Shan­non, von Neu­mann and oth­ers, in cre­at­ing a fun­da­men­tal piece of the­ory that helps hu­man­ity to un­der­stand a wide range of pow­er­ful phe­nom­ena. Gain­ing an un­der­stand­ing of the ba­sic the­ory of in­tel­li­gent agents well enough to think clearly about them is plau­si­bly nec­es­sary for build­ing an AGI that en­sures the long term fu­ture goes well.

    • It seems to me that they are mak­ing real progress (al­though I’m not con­fi­dent of the rate of that progress) - for ex­am­ple, MIRI has dis­cov­ered a Solomonoff-in­duc­tion-style al­gorithm that can rea­son well un­der log­i­cal un­cer­tainty, learn­ing rea­son­able prob­a­bil­ities for math­e­mat­i­cal propo­si­tions be­fore they can be proved, which I found sur­pris­ing. While I am un­cer­tain about the use­ful­ness of this par­tic­u­lar in­sight on the path to fur­ther ba­sic the­ory, I take it as some ev­i­dence that they’re us­ing meth­ods that can in prin­ci­ple make progress, which is some­thing that I have his­tor­i­cally been pes­simistic about.

  • Only in recent years have there been routes to working on alignment that also offer funding, status, and a stable social life. Nowadays many others are helping out with the work of solving alignment, but MIRI's core staff worked on the problem while all the incentives pulled in other directions. For me this is a strong sign of their integrity, and it makes me expect they will make good decisions in many contexts where the best action isn't the locally incentivized action. It is also evidence that, when I can't understand why some weird action of theirs is good, they will often still be correct to take it, which is an outside-view consideration in favor of funding them in cases where I don't have my own inside-view model of why the project they're working on is good.

  • On that note, MIRI has also worked on a number of other projects that have attempted to teach the skills behind their general methodology for reasoning quantitatively and scientifically about the world and taking the right action. I regret not having the time to detail all the impacts of these projects, but they include (and are not limited to): LessWrong, The Sequences, HPMOR, Inadequate Equilibria, Embedded Agency, and CFAR (an organization I discuss below). I view these as some of the main reasons the X-risk community exists.

  • Another outside view to consider is the support of MIRI by so many others whom I trust. Their funders have included Open Phil, BERI, FLI, and Jaan Tallinn, plus a variety of smaller donors I trust, and they are advised by Stuart Russell and Nick Bostrom. They've also been supported by other people who I don't necessarily trust directly, but who I do think have interesting and valuable perspectives on the world, like Peter Thiel and Vitalik Buterin.

  • I also judge the staff to be ex­cep­tion­ally com­pe­tent. Some ex­am­ples:

    • The programming team includes people who were very early hires at multiple good startups (such as Triplebyte, Recursion Pharmaceuticals, and Quixey), as well as the Haskell core developer Edward Kmett.

    • The ops staff are cur­rently, in my eval­u­a­tion, the most com­pe­tent op­er­a­tions team of any of the or­ga­ni­za­tions that I have per­son­ally in­ter­acted with.

In sum, I think MIRI is one of the most com­pe­tent and skil­led teams at­tempt­ing to im­prove the long-term fu­ture, I have a lot of trust in their de­ci­sion-mak­ing, and I’m strongly in fa­vor of en­sur­ing that they’re able to con­tinue their work.

Thoughts on fund­ing gaps

De­spite all of this, I have not ac­tu­ally recom­mended a large grant to MIRI.

  • This is because MIRI's funding situation is solid at its current level (I would be thinking very differently if I had tens of millions of dollars to give away annually). MIRI's marginal use of dollars at this level of funding seems lower-impact, so I only recommended $50k.

  • I feel con­flicted about whether it might be bet­ter to give MIRI more money. His­tor­i­cally, it has been com­mon in the EA fund­ing land­scape to only give fund­ing to or­ga­ni­za­tions when they have demon­strated con­crete room for more fund­ing, or when fund­ing is the main bot­tle­neck for the or­ga­ni­za­tion. I think this has al­lowed us to start many small or­ga­ni­za­tions that are work­ing on a va­ri­ety of differ­ent prob­lems.

    • A com­mon way in which at least some fund­ing de­ci­sions are made is to com­pare the effect of a marginal dona­tion now with the effect of a marginal dona­tion at an ear­lier point in the pro­ject’s life­cy­cle (i.e. not want­ing to in­vest in a pro­ject af­ter it has hit strongly diminish­ing marginal re­turns, aka “maxed out its room for more fund­ing” or “filled the fund­ing gap”).

    • However, when I think about this from first principles, I think we should expect a heavy-tailed (probably log-normal) distribution in the impact of different cause areas, individuals, and projects. And while I can imagine that many good opportunities might hit strongly diminishing marginal returns early on, it doesn't seem likely for most projects. Instead, I expect factors that stay constant over the life of a project, like its broader organizational philosophy, core staff, and choice of problem to solve, to determine a large part of its marginal value. Thus, we should expect our best guesses to be worth investing significant further resources into (the toy simulation below illustrates how concentrated total impact is under such a distribution).

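To make that heavy-tailed intuition concrete, here is a minimal toy simulation (my own illustration, not something from the application or from MIRI): it samples per-project impact from an assumed log-normal distribution and reports how much of the total impact comes from the top slices of projects. The sigma value is an arbitrary assumption chosen only to show the qualitative effect of a heavy tail.

```python
# Toy sketch: under an assumed log-normal distribution of per-project impact,
# a small fraction of projects accounts for most of the total impact.
# sigma=2.0 is an arbitrary assumption, not an estimate from this post.
import numpy as np

rng = np.random.default_rng(0)
impacts = np.sort(rng.lognormal(mean=0.0, sigma=2.0, size=100_000))

total = impacts.sum()
top_10_share = impacts[-10_000:].sum() / total  # best 10% of projects
top_1_share = impacts[-1_000:].sum() / total    # best 1% of projects

print(f"top 10% of projects: ~{top_10_share:.0%} of total impact")
print(f"top 1% of projects:  ~{top_1_share:.0%} of total impact")
# With these (assumed) parameters, the top 10% of projects contribute roughly
# three quarters of the total, which is the sense in which the best identified
# projects can stay valuable at the margin rather than quickly "filling up".
```
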
How­ever, this is all com­pli­cated by a va­ri­ety of coun­ter­vailing con­sid­er­a­tions, such as the fol­low­ing three:

  1. Power law dis­tri­bu­tions of im­pact only re­ally mat­ter in this way if we can iden­tify which in­ter­ven­tions we ex­pect to be in the right tail of im­pact, and I have a lot of trou­ble prop­erly bound­ing my un­cer­tainty here.

  2. If we are faced with sig­nifi­cant un­cer­tainty about cause ar­eas, and we need or­ga­ni­za­tions to have worked in an area for a long time be­fore we can come to ac­cu­rate es­ti­mates about its im­pact, then it’s a good idea to in­vest in a broad range of or­ga­ni­za­tions in an at­tempt to get more in­for­ma­tion. This is re­lated to com­mon ar­gu­ments around “ex­plore/​ex­ploit trade­offs”.

  3. Some­times, mak­ing large amounts of fund­ing available to one or­ga­ni­za­tion can have nega­tive con­se­quences for the broader ecosys­tem of a cause area. Also, giv­ing an or­ga­ni­za­tion ac­cess to more fund­ing than it can use pro­duc­tively may cause it to make too many hires or lose fo­cus by try­ing to scale too quickly. Hav­ing more fund­ing of­ten also at­tracts ad­ver­sar­ial ac­tors and in­creases com­pet­i­tive stakes within an or­ga­ni­za­tion, mak­ing it a more likely tar­get for at­tack­ers.

I can see ar­gu­ments that we should ex­pect ad­di­tional fund­ing for the best teams to be spent well, even ac­count­ing for diminish­ing mar­gins, but on the other hand I can see many meta-level con­cerns that weigh against ex­tra fund­ing in such cases. Over­all, I find my­self con­fused about the marginal value of giv­ing MIRI more money, and will think more about that be­tween now and the next grant round.

CFAR ($150,000)

[Edit: It seems relevant to mention that LessWrong is currently receiving operational support from CFAR, in a way that makes me technically an employee of CFAR (similar to how ACE and 80K were/are part of CEA for a long time). However, LessWrong operates as a completely separate entity with its own fundraising and hiring procedures, and I don't feel any hesitation or pressure to critique CFAR openly because of that relationship. I do find myself a tiny bit more hesitant to speak harshly of specific individuals, simply because I am only working a floor away from the CFAR offices and that does have some psychological effect on me. But the same was true for CEA while LessWrong was located in the CEA office for a few months, and for residents of my group house while LessWrong was located in its living room for most of the past two years, so I don't think this effect is particularly large.]

I think that CFAR’s in­tro work­shops have his­tor­i­cally had a lot of pos­i­tive im­pact. I think they have done so via three path­ways.

  1. Estab­lish­ing epistemic norms: I think CFAR work­shops are quite good at helping the EA and ra­tio­nal­ity com­mu­nity es­tab­lish norms about what good dis­course and good rea­son­ing look like. As a con­crete ex­am­ple of this, the con­cept of Dou­ble Crux has got­ten trac­tion in the EA and ra­tio­nal­ity com­mu­ni­ties, which has im­proved the way ideas and in­for­ma­tion spread through­out the com­mu­nity, how ideas get eval­u­ated, and what kinds of pro­jects get re­sources. More broadly, I think CFAR work­shops have helped in es­tab­lish­ing a set of com­mon norms about what good rea­son­ing and un­der­stand­ing look like, similar to the effect of the se­quences on LessWrong.

    1. I think that it’s pos­si­ble that the ma­jor­ity of the value of the EA and ra­tio­nal­ity com­mu­ni­ties comes from hav­ing that set of shared epistemic norms that al­lows them to rea­son col­lab­o­ra­tively in a way that most other com­mu­ni­ties can­not (in the same way that what makes sci­ence work is a set of shared norms around what con­sti­tutes valid ev­i­dence and how new knowl­edge gets cre­ated).

    2. As an ex­am­ple of the im­por­tance of this: I think a lot of the ini­tial ar­gu­ments for why AI risk is a real con­cern were “weird” in a way that was not eas­ily com­pat­i­ble with a naive em­piri­cist wor­ld­view that I think is pretty com­mon in the broader in­tel­lec­tual world.

      1. In par­tic­u­lar, the ar­gu­ments for AI risk are hard to test with ex­per­i­ments or em­piri­cal stud­ies, but hold up from the per­spec­tive of log­i­cal and philo­soph­i­cal rea­son­ing and are gen­er­ated by a va­ri­ety of good mod­els of broader tech­nolog­i­cal progress, game the­ory, and re­lated ar­eas of study. But for those ar­gu­ments to find trac­tion, they re­quired a group of peo­ple with the rele­vant skills and habits of thought for gen­er­at­ing, eval­u­at­ing, and hav­ing ex­tended in­tel­lec­tual dis­course about these kinds of ar­gu­ments.

  2. Train­ing: A per­centage of in­tro work­shop par­ti­ci­pants (many of whom were already work­ing on im­por­tant prob­lems within X-risk) have seen sig­nifi­cant im­prove­ments in com­pe­tence; as a re­sult, they be­came sub­stan­tially more effec­tive in their work.

  3. Re­cruit­ment: CFAR has helped many peo­ple move from pas­sive mem­ber­ship in the EA and ra­tio­nal­ity com­mu­nity to hav­ing strong so­cial bonds in the X-risk net­work.

While I do think that CFAR has his­tor­i­cally caused a sig­nifi­cant amount of im­pact, I feel hes­i­tant about this grant be­cause I am un­sure whether CFAR can con­tinue to cre­ate the same amount of im­pact in the fu­ture. I have a few rea­sons for this:

  • First, all of its found­ing staff and many other early staff have left. I broadly ex­pect or­ga­ni­za­tions to get a lot worse once their early staff leaves.

    • Some ex­am­ples of peo­ple who left af­ter work­ing there:

      • Ju­lia Galef (left a few years ago to start the Up­date Pro­ject)

      • An­drew Critch (left to join first Jane Street, then MIRI, then founded CHAI and BERI)

      • Kenzi Askhie

      • Dun­can Sabien

    • Anna Sala­mon has re­duced her in­volve­ment in the last few years and seems sig­nifi­cantly less in­volved with the broader strate­gic di­rec­tion of CFAR (though she is still in­volved in some of the day-to-day op­er­a­tions, cur­ricu­lum de­vel­op­ment, and more re­cent CFAR pro­gram­mer work­shops). [Note: After talk­ing to Anna about this, I am now less cer­tain of whether this ac­tu­ally ap­plies and am cur­rently con­fused on this point]

    • Dun­can Sa­bien is no longer in­volved in day-to-day work, but still does some amount of teach­ing at in­tro work­shops and pro­gram­mer work­shops (though I think he is plan­ning to phase that out) and will help with the up­com­ing in­struc­tor train­ing.

    • I think that Ju­lia, Anna and Critch have all worked on pro­jects of enor­mous im­por­tance, and their work over the last few years has clearly demon­strated a level of com­pe­tence that makes me ex­pect that CFAR will strug­gle to main­tain its level of qual­ity with their in­volve­ment sig­nifi­cantly re­duced.

  • From re­cent con­ver­sa­tions with CFAR, I’ve got­ten a sense that the staff isn’t in­ter­ested in in­creas­ing the num­ber of in­tro work­shops, that the in­tro work­shops don’t feel par­tic­u­larly ex­cit­ing for the staff, and that most staff are less in­ter­ested in im­prov­ing the in­tro work­shops than other parts of CFAR. This makes it less likely that those work­shops will main­tain their qual­ity and im­pact, and I cur­rently think that those work­shops are likely one of the best ways for CFAR to have a large im­pact.

  • I have a gen­eral sense that CFAR is strug­gling to at­tract top tal­ent, par­tially be­cause some of the best staff left, and par­tially due to a gen­eral sense of a lack of for­ward mo­men­tum for the or­ga­ni­za­tion. This is a bad sign, be­cause I think CFAR in par­tic­u­lar benefits from hav­ing highly tal­ented in­di­vi­d­u­als teach at their work­shops and serve as a con­crete ex­am­ple of the skills they’re try­ing to teach.

  • My im­pres­sion is that while the in­tro work­shops were his­tor­i­cally fo­cused on in­stru­men­tal ra­tio­nal­ity and per­sonal pro­duc­tivity, the origi­nal CFAR staff was ori­ented quite strongly around truth-seek­ing. Core ra­tio­nal­ity con­cepts were con­veyed in­di­rectly by the staff in smaller con­ver­sa­tions and in the broader cul­ture of the or­ga­ni­za­tion. The cur­rent staff seems less ori­ented around that kind of epistemic ra­tio­nal­ity, and so I ex­pect that if they con­tinue their cur­rent fo­cus on per­sonal pro­duc­tivity and in­stru­men­tal ra­tio­nal­ity, the epistemic benefits of CFAR work­shops will be re­duced sig­nifi­cantly, and those are the benefits I care about most.

However, there are some additional considerations that led me to recommend this grant.

  • First, CFAR and MIRI are collaborating on a set of programmer-focused workshops that I am also quite positive on. I think those workshops are less directly influenced by counterfactual donations than the mainline workshops, since I expect MIRI to fund them in any case, but they do still rely on CFAR existing as an institution that can provide instructors. I am excited about the opportunities these workshops will create for curriculum development, since they can focus almost solely on epistemic rationality.

  • Se­cond, I think that if CFAR does not re­ceive a grant now, there’s a good chance they’d be forced to let sig­nifi­cant por­tions of their staff go, or take some other ir­re­versible ac­tion. CFAR de­cided not to run a fundraiser last fall be­cause they felt like they’d made sig­nifi­cant mis­takes sur­round­ing a de­ci­sion made by a com­mu­nity dis­pute panel that they set up and were re­spon­si­ble for, and they felt like it would be in poor taste to ask the com­mu­nity for money be­fore they thor­oughly in­ves­ti­gated what went wrong and re­leased a pub­lic state­ment.

    • I think this was the cor­rect course of ac­tion, and I think over­all CFAR’s re­sponse to the mis­takes they made last year has been quite good.

    • The lack of a fundraiser led CFAR to have a much greater need for fund­ing than usual, and a grant this round will likely make a sig­nifi­cant differ­ence in CFAR’s fu­ture.

In the last year, I had some concerns about the way CFAR communicated a lot of its insights, and I sensed an insufficient emphasis on a kind of robust and transparent reasoning that I don't have a great name for. I don't think the communication style I was advocating for is always the best way to make new discoveries, but it is very important for establishing broader community-wide epistemic norms, and it enables a kind of long-term intellectual progress that I think is necessary for solving the intellectual challenges we'll need to overcome to avoid global catastrophic risks. I think CFAR is likely to respond to last year's events by improving their communication and reasoning style in this respect (from my perspective).

My over­all read is that CFAR is perform­ing a va­ri­ety of valuable com­mu­nity func­tions and has a strong enough track record that I want to make sure that it can con­tinue ex­ist­ing as an in­sti­tu­tion. I didn’t have enough time this grant round to un­der­stand how the fu­ture of CFAR will play out; the cur­rent grant amount seems suffi­cient to en­sure that CFAR does not have to take any dras­tic ac­tion un­til our next grant round. By the next grant round, I plan to have spent more time learn­ing and think­ing about CFAR’s tra­jec­tory and fu­ture, and to have a more con­fi­dent opinion about what the cor­rect fund­ing level for CFAR is.