When should EAs allocate funding randomly? An inconclusive literature review.

Summary: I considered the question “Under what conditions should the targets of EA funding be chosen randomly?” I reviewed publications on the performance of random decision strategies, which I initially suspected might support randomized funding in some situations. In this post, I explain why I now think these publications provide very little guidance on funding allocation. Overall I remain uncertain whether one could improve, say, EA Grants or Open Phil’s grantmaking by introducing some random element.

I spent about 90 hours on this pro­ject as part of a Sum­mer Re­search Fel­low­ship at the Cen­tre for Effec­tive Altru­ism.

[This post is quite long, but un­for­tu­nately I wasn’t able to in­clude a table of con­tents. If it’s pos­si­ble to link to sec­tions of this post, please let me know.]


Re­search ques­tion, scope, and terminology

Re­search ques­tion. The ques­tion I aimed to in­ves­ti­gate was: Un­der what con­di­tions should the tar­gets of EA fund­ing be cho­sen ran­domly? To this end I also looked at the perfor­mance of ran­dom strate­gies for de­ci­sions other than fund­ing. Since I never found pos­i­tive con­clu­sions about fund­ing, the in­tended fo­cus on fund­ing isn’t well re­flected in this re­port, which fo­cuses on nega­tive con­clu­sions.

Cri­te­ria for eval­u­at­ing de­ci­sion strate­gies. I was most in­ter­ested in ex-ante effec­tive­ness, un­der­stood as max­i­miz­ing the ex­pected value of some quan­tity be­fore a de­ci­sion. I didn’t in­ves­ti­gate other crite­ria such as fair­ness, cost, or in­cen­tive effects, but will some­times briefly dis­cuss these.

Scope. I was thinking of situations where an EA individual or organization allocates funding between targets that would then use this funding with the ultimate aim of helping others. One example is donations to charities. I was not interested in transfers of money prior to such a decision; in particular, the use of lotteries to pool resources and enable economies of scale – as in donor lotteries – was outside the scope of this project. Neither was I interested in situations where financial transfers are intended to primarily help the recipients, as in social welfare or GiveDirectly’s cash transfers. Finally, I did not consider how the funding decisions of different funders interact, as in questions around funging. [1]

Terminology. I’ll use random strategy or lottery to refer to decision mechanisms that use deliberate [2] randomization, as in flipping a coin. The randomization need not be uniform, e.g. it could involve the flip of a biased coin. Lotteries can combine randomization with other mechanisms; for example, allocating only a part of all funding randomly or randomizing only between a subset of options would count as lotteries. I’ll refer to strategies that aren’t lotteries as nonrandom.

Why I was interested

Sev­eral pub­li­ca­tions claim that, when al­lo­cat­ing re­sources, lot­ter­ies can out­perform cer­tain non­ran­dom strate­gies such as grant peer re­view or pro­mo­tions based on past perfor­mance. I found such claims for pro­mo­tions in hi­er­ar­chi­cal or­ga­ni­za­tions, se­lect­ing mem­bers of a par­li­a­ment, fi­nan­cial trad­ing, and sci­ence fund­ing (see sec­tion Sur­vey of the liter­a­ture for refer­ences). They were based on both em­piri­cal ar­gu­ments and math­e­mat­i­cal mod­els.

I was cu­ri­ous whether some of these find­ings might ap­ply to de­ci­sions fre­quently en­coun­tered by EAs. For ex­am­ple, the fol­low­ing are com­mon in EA:

  • In­di­vi­d­ual donors se­lect­ing a char­ity.

  • In­sti­tu­tional fun­ders spon­sor­ing in­di­vi­d­u­als, e.g. CEA’s EA Grants, Open Phil’s AI Fel­lows Pro­gram.

  • Rel­a­tively con­ven­tional sci­ence fund­ing, e.g. Open Phil’s Scien­tific Re­search cat­e­gory or some of their AI safety and strat­egy grants.

For refer­ence, I’m in­clud­ing a link to my origi­nal re­search pro­posal with which I ap­plied to CEA’s Sum­mer Re­search Fel­low­ship.

Sur­vey of the pub­li­ca­tions I reviewed

This table ex­hibits the most rele­vant pa­pers I re­viewed. Note that my re­view wasn’t sys­tem­atic; see the sub­sec­tion Limi­ta­tions for more de­tail on this and other limi­ta­tions.

Some con­text on the table:

  • The column on Type of ar­gu­ment uses the fol­low­ing cat­e­gories and sub­cat­e­gories:

    • De­duc­tive: Math­e­mat­i­cal proofs and an­a­lytic solu­tions – as op­posed to ap­prox­i­ma­tions – of quan­ti­ta­tive mod­els.

    • Em­piri­cal: Ar­gu­ments based on data or anec­dotes from the real world.

      • Qual­i­ta­tive: Non-quan­ti­ta­tive dis­cus­sion, usu­ally case stud­ies.

      • Sur­vey: Re­sponses of hu­man sub­jects to stan­dard­ized ques­tion­naires.

      • Ret­ro­spec­tive: Data from non-ex­per­i­men­tal set­tings cov­er­ing some ex­tended pe­riod of time in the past.

      • Lab ex­per­i­ments: Data from lab­o­ra­tory ex­per­i­ments of the type com­mon in psy­chol­ogy and be­hav­ioral eco­nomics.

    • MC simu­la­tion: Monte Carlo simu­la­tion, i.e. re­peat­edly run­ning a stochas­tic al­gorithm and re­port­ing the av­er­age re­sults.

      • Agent-based model: Com­po­si­tional mod­els that in­clude parts in­tended to rep­re­sent in­di­vi­d­ual agents, typ­i­cally track­ing their de­vel­op­ment and in­ter­ac­tion over time to in­ves­ti­gate the effects on some ag­gre­gate or macro-level prop­erty.

      • Sim­ple model: Here de­notes any quan­ti­ta­tive model that isn’t clearly agent-based.

    • Com­pre­hen­sive: Paper pro­vides or cites ar­gu­ments from sev­eral of the above types to make an all-things-con­sid­ered case for its con­clu­sion.

  • I also re­viewed pub­li­ca­tions that don’t ex­plic­itly dis­cuss lot­ter­ies when I sus­pected that their con­tent might be rele­vant. Usu­ally this was be­cause they seemed to re­veal some short­com­ing of non­ran­dom strate­gies. In these cases I re­port the most rele­vant claim in the column Stance on lot­ter­ies.

  • The table only covers those parts of a publication that are relevant to my topic. For example, Frank’s (2016) simulations are only a part of his overall discussion that culminates in him arguing for a progressive consumption tax.

Strength and scope of the en­dorse­ment of lot­ter­ies in the literature

Several publications I reviewed make claims that are about decision situations in the real world, or at least can reasonably be interpreted as such by readers who only read, say, their abstracts or conclusion sections. Examples include:

“The pro­pos­als [for in­tro­duc­ing ran­dom el­e­ments into re­search fund­ing] have been sup­ported on effi­ciency grounds, with mod­els, in­clud­ing so­cial episte­mol­ogy mod­els, show­ing ran­dom al­lo­ca­tion could in­crease the gen­er­a­tion of sig­nifi­cant truths in a com­mu­nity of sci­en­tists when com­pared to fund­ing by peer re­view.” (Avin, 2018, p. 1)
“We also com­pare sev­eral policy hy­pothe­ses to show the most effi­cient strate­gies for pub­lic fund­ing of re­search, aiming to im­prove mer­i­toc­racy, di­ver­sity of ideas and in­no­va­tion.” (Pluch­ino et al., 2018, p. 1850014-2)
“This means that a Parliament without legislators free from the influence of Parties turns out to be rather inefficient (as probably happens in reality).” (Pluchino et al., 2011b, p. 3948)
“In con­clu­sion, our study pro­vides rigor­ous ar­gu­ments in fa­vor of the idea that the in­tro­duc­tion of ran­dom se­lec­tion sys­tems, re­dis­cov­er­ing the wis­dom and the his­tory of an­cient democ­ra­cies, would be broadly benefi­cial for mod­ern in­sti­tu­tions.” (Pluch­ino et al., 2011b, p. 3953)
“Fi­nally, we ex­pect that there would be also sev­eral other so­cial situ­a­tions, be­yond the Par­li­a­ment, where the in­tro­duc­tion of ran­dom mem­bers could be of help in im­prov­ing the effi­ciency.” (Pluch­ino et al., 2011b, p. 3953)
“[T]he re­cent dis­cov­ery that the adop­tion of ran­dom strate­gies can im­prove the effi­ciency of hi­er­ar­chi­cal or­ga­ni­za­tions” (Pluch­ino et al., 2011b, p. 3944, about their 2010 and 2011a)
“In all the ex­am­ples we have pre­sented, a com­mon fea­ture strongly emerges: the effi­ciency of an or­ga­ni­za­tion in­creases sig­nifi­cantly if one adopts a ran­dom strat­egy of pro­mo­tion with re­spect to a sim­ple mer­i­to­cratic pro­mo­tion of the best mem­bers.” (Pluch­ino et al., 2011a, p. 3505)
“We think that these re­sults could be use­ful to guide the man­age­ment of large real hi­er­ar­chi­cal sys­tems of differ­ent na­tures and in differ­ent fields.” (Pluch­ino et al., 2010, p. 471)
“It may well be, for ex­am­ple, that when there are many more de­serv­ing con­tes­tants than di­visi­ble re­sources (e.g. ten good ap­pli­cants for five jobs), the fi­nal se­lec­tion should be ex­plic­itly and pub­li­cly made by lot­tery.” (Thorn­gate, 1988, p. 14)

Elster (1989, p. 116) asks “why lot­ter­ies are so rarely used when there are so many good ar­gu­ments for us­ing them”. Neu­rath (1913, p. 11; via [3] Elster, 1989, pp. 121f.) even de­scribed the abil­ity to use lot­ter­ies when there is in­suffi­cient ev­i­dence for a de­liber­ate de­ci­sion as the fi­nal of “four stages of de­vel­op­ment of mankind”.

There also are dis­sent­ing voices, e.g. Hofs­tee (1990). How­ever, my ap­proach was to as­sess the ar­gu­ments fa­vor­ing lot­ter­ies my­self rather than to search for coun­ter­ar­gu­ments in the liter­a­ture. I there­fore haven’t tried to provide a rep­re­sen­ta­tive sam­ple of the liter­a­ture. Even if I had, I would ex­pect ex­plicit anti-lot­tery po­si­tions to be un­der­rep­re­sented be­cause not us­ing a lot­tery might seem like a de­fault po­si­tion not worth ar­gu­ing for.

Real-world use of lotteries

I also came across refer­ences to myth­i­cal, his­toric, and con­tem­po­rary cases in which lot­ter­ies were or are be­ing used. Th­ese sug­gest that there are rea­sons – though not nec­es­sar­ily re­lated to effec­tive­ness – fa­vor­ing lot­ter­ies in at least some real-world situ­a­tions. I didn’t in­ves­ti­gate these cases fur­ther, but briefly list some sur­veys:

  • Elster (1989, sc. II.4-II.7, pp. 62-103).

    • Elster (1989, p. 104) says he aimed for his list to be “reasonably exhaustive”, and that he would be “surprised if I have missed any major examples”.

    • Elster (1989, pp. 104f.) lists the fol­low­ing pat­terns in the use cases sur­veyed by him:

      • More fre­quent in democ­ra­cies or pop­u­lar es­tates.

      • When they can be in­ter­preted as an ex­pres­sion of God’s will.

      • As­sign­ing peo­ple for le­gal and ad­minis­tra­tive tasks.

      • Allo­cat­ing bur­dens – as op­posed to goods – to peo­ple.

  • Boyce (1994, pp. 457f.) describes some biblical, historic, and modern use cases.

  • Boyle lists dozens of cases on his web­site. [4]

  • At least three institutional science funders have implemented lotteries (Avin, 2018, pp. 1f.).


Limitations

Here I list some limitations of my literature review. I chose not to pursue these points either because I was pessimistic about their value, or because they seemed too far removed from my research question (while potentially being interesting in other contexts).

  • My re­view wasn’t sys­tem­atic or com­pre­hen­sive. I started from the pub­li­ca­tions I had first heard of, which were Avin’s work on sci­ence fund­ing and the refer­ences in a Scien­tific Amer­i­can blog post by psy­chol­o­gist Scott Barry Kauf­man.

  • I generally didn’t try to independently verify results. In particular, I neither replicated simulations nor checked details of calculations or proofs. Below, I do note where my impression is that a paper’s stated conclusions aren’t supported by its reported results, or where one publication seems to misrepresent another. However, I haven’t tried to identify all problems and merely report those I happened to notice.

  • I didn’t (or not ex­ten­sively) re­view:

    • The fol­low­ing three books, which might be in­ter­est­ing to re­view. [5]

      • Mauboussin (2012)

      • Si­mon­ton (2004)

      • Not ex­ten­sively Elster (1989)

    • Not ex­ten­sively the liter­a­ture in what Liu and de Rond (2016, p. 12) call the “ran­dom school of thought in man­age­ment”, e.g. the work of Jerker Den­rell. The cen­tral claim of that school ap­pears to be that “ran­dom vari­a­tion should be con­sid­ered one of the most im­por­tant ex­plana­tory mechanisms in the man­age­ment sci­ences” (Den­rell et al., 2014).

    • Not ex­ten­sively the case of sci­ence fund­ing. E.g., I didn’t con­sult refer­ences 69-72 in Pluch­ino et al. (2018), or most of the refer­ences cited by Avin.

    • Not ex­ten­sively the case for ran­dom se­lec­tion of mem­bers of par­li­a­ment, known as de­marchy or sor­ti­tion.

    • Work by Nas­sim Taleb that was some­times cited as be­ing rele­vant, speci­fi­cally his books The Black Swan: The Im­pact of the Highly Im­prob­a­ble and Fooled by Ran­dom­ness: The Hid­den Role of Chance in Life and in the Mar­kets.

    • Em­piri­cal and the­o­ret­i­cal work on the re­li­a­bil­ity and val­idity of rele­vant pre­dic­tions of perfor­mance or im­pact in fund­ing de­ci­sions, e.g. the work of Philip Tet­lock, Martin et al. (2016), or Bar­nett’s (2008, p. 231) claim that in cer­tain con­di­tions “or­ga­ni­za­tions are likely to fall into com­pe­tency traps, mis­ap­ply­ing to new com­pet­i­tive re­al­ities les­sons that were learned un­der differ­ent cir­cum­stances”.

    • Em­piri­cal and the­o­ret­i­cal work on the con­di­tions un­der which we might ex­pect that the op­tions from which we se­lect are in some sense equally good, e.g. work re­lated to the Effi­cient-Mar­ket Hy­poth­e­sis.

My con­clu­sions from the liter­a­ture review

In sum­mary, I think that the pub­li­ca­tions I re­viewed:

  1. De­mon­strate that it’s con­cep­tu­ally pos­si­ble that de­cid­ing by lot­tery can have a strictly larger ex-ante ex­pected value than de­cid­ing by some non­ran­dom pro­ce­dures, even when the lat­ter aren’t as­sumed to be ob­vi­ously bad or to have high cost.

  2. Provide some qual­i­ta­tive sug­ges­tions for con­di­tions un­der which lot­ter­ies might have this prop­erty.

  3. Don’t by them­selves es­tab­lish that this benefi­cial po­ten­tial of lot­ter­ies is com­mon, or that the benefi­cial effect would be large for some par­tic­u­lar EA fund­ing de­ci­sion.

  4. Don’t provide a method to de­ter­mine the perfor­mance of lot­ter­ies that would be eas­ily ap­pli­ca­ble to any spe­cific EA fund­ing de­ci­sion.

  5. Over­all sug­gest that the case for lot­ter­ies is strongest in situ­a­tions that are most similar to in­sti­tu­tional sci­ence fund­ing. How­ever, even in such cases it re­mains un­clear whether lot­ter­ies are strictly op­ti­mal.

In the following subsections, I’ll first give my reasoning behind the negative conclusions 3. to 5. I’ll then explain the first two, positive conclusions.

Unfortunately, the positive conclusions 1. and 2. are weak. I also believe they are relatively obvious, and are supported by shallow conceptual considerations as well as common sense, neither of which requires a literature review.

Why I don’t be­lieve these pub­li­ca­tions by them­selves sig­nifi­cantly sup­port lot­ter­ies in EA fund­ing decisions

I’ll first give sev­eral rea­sons that limit the rele­vance of the liter­a­ture for my re­search ques­tion. While not all rea­sons ap­ply to all pub­li­ca­tions I re­viewed, at least one rea­son ap­plies to most. In a fi­nal sub­sub­sec­tion, I ex­plain why the nega­tive con­clu­sions 3. and 4. men­tioned above fol­low.

Re­sults out­side of the scope of my in­ves­ti­ga­tion, and thus per­haps gen­er­ally less rele­vant in an EA context

In an EA con­text we want to max­i­mize ex­pected util­ity ex ante, i.e. be­fore some de­ci­sion. How­ever, some of the claims in the liter­a­ture are ex post ob­ser­va­tions, or con­cern ex ante prop­er­ties other than the ex­pected value.

***Short­com­ings of al­ter­na­tives to lot­ter­ies that don’t im­ply lot­ter­ies are su­pe­rior ac­cord­ing to any crite­rion.***

Th­ese claims are made in the fol­low­ing con­text. We are con­cerned with de­ci­sions be­tween op­tions that each have an as­so­ci­ated quan­ti­ta­tive value. We want to se­lect the high­est-value op­tion. How­ever, we only have ac­cess to noisy mea­sure­ments of an op­tion’s value. That is, if we mea­sured the value of the same op­tion sev­eral times, there would be some ‘ran­dom’ vari­a­tion in the mea­sured val­ues. For ex­am­ple, when mak­ing a hiring de­ci­sion we might use the quan­tified perfor­mance on a work test as mea­sure­ments; the scores on the tests would vary even for the same ap­pli­cant be­cause of ‘ran­dom’ vari­a­tions in day-to-day perfor­mance, or ‘ran­dom’ ac­ci­den­tal er­rors made when eval­u­at­ing the tests.

One natural decision procedure in such a situation is to select the option with the highest measured value; I’ll call this option the winning option. I encountered several claims that might be viewed as shortcomings of the decision procedure of always choosing the winning option. I was interested in identifying such shortcomings in order to then investigate whether lotteries avoid them. I’ll now list the relevant claims, and explain why I believe they cannot be used to vindicate lotteries according to any criterion (and so in particular not the ones I’m interested in).

The ‘op­ti­mizer’s curse’, or ‘post-de­ci­sion re­gret’. Smith and Win­kler (2006) prove that, un­der mild con­di­tions, the win­ning op­tion’s mea­sured value sys­tem­at­i­cally over­es­ti­mates its ac­tual value. This holds even when all mea­sure­ments are un­bi­ased, i.e. the ex­pected value of each mea­sure­ment co­in­cides with the ac­tual value of the mea­sured op­tion.

How­ever, their re­sult doesn’t im­ply that the ex-ante ex­pected value of choos­ing an­other op­tion would have been higher. In fact, I be­lieve that choos­ing the win­ning op­tion does max­i­mize ex­pected value if all mea­sure­ments are un­bi­ased and their re­li­a­bil­ity doesn’t vary too much. [6]
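To illustrate both halves of this point, here is a minimal Monte Carlo sketch. It is not taken from Smith and Winkler (2006); the setup (i.i.d. standard normal true values, equally reliable zero-mean normal noise) and all names and parameter values are assumptions chosen for illustration. It estimates how much the winning option's measured value exceeds its true value, and compares the average true value of the winning option with that of a uniformly random pick.

```python
import random
import statistics


def simulate_optimizers_curse(n_options=10, noise_sd=1.0, n_trials=20000, seed=0):
    """Return (mean overestimate of the winner, mean true value of the winner,
    mean true value of a uniformly random pick)."""
    rng = random.Random(seed)
    overestimates, winner_values, lottery_values = [], [], []
    for _ in range(n_trials):
        # Assumption: true values are i.i.d. standard normal.
        true_values = [rng.gauss(0.0, 1.0) for _ in range(n_options)]
        # Each measurement is unbiased: true value plus zero-mean noise.
        measured = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        winner = max(range(n_options), key=lambda i: measured[i])
        overestimates.append(measured[winner] - true_values[winner])
        winner_values.append(true_values[winner])
        lottery_values.append(true_values[rng.randrange(n_options)])
    return (
        statistics.fmean(overestimates),
        statistics.fmean(winner_values),
        statistics.fmean(lottery_values),
    )


overestimate, winner_value, lottery_value = simulate_optimizers_curse()
print(f"winner's measured minus true value: {overestimate:.2f}")
print(f"true value of winner: {winner_value:.2f}, of random pick: {lottery_value:.2f}")
```

Under these assumptions the winner's measurement is inflated on average, yet the winner still has a higher average true value than a random pick. The sketch says nothing about cases where measurement reliability differs strongly between options.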

The win­ning op­tion may be un­likely to be the best one. Thorn­gate and Car­roll (1987), Frank (2016, p. 157, Fig. A1.2), and Pluch­ino et al. (2018, p. 1850014-13, Fig. 7), each us­ing some­what differ­ent as­sump­tions, demon­strate that the ab­solute prob­a­bil­ity of the win­ning op­tion ac­tu­ally be­ing the high­est-value one can be small.

How­ever, more rele­vant for our pur­poses is the rel­a­tive value of that prob­a­bil­ity: can we iden­tify an op­tion other than the win­ning one that is more likely to be the high­est-value one? Their re­sults provide no rea­son to think that we can, and again my ini­tial im­pres­sion is that given their as­sump­tions we in fact can­not. In any case, we ul­ti­mately aim to max­i­mize the ex­pected value, not the prob­a­bil­ity of se­lect­ing the high­est-value op­tion. (My ini­tial im­pres­sion is that these two crite­ria always recom­mend the same op­tion at least in the sim­ple mod­els of Thorn­gate and Car­roll [1987] and Frank [2016], but in prin­ci­ple they could come apart.) But nei­ther do these re­sults provide a rea­son to think that de­vi­at­ing from the win­ning op­tion in­creases our de­ci­sion’s ex­pected value. In fact, Fig. 8(b) in Pluch­ino et al. (2018, p. 1850014-15) in­di­cates that se­lect­ing the win­ning op­tion has a higher ex­pected value than se­lect­ing an op­tion uniformly at ran­dom.
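The distinction between the absolute and the relative probability can be sketched with a small simulation. This is an illustrative toy setup, not the model of any of the cited papers: the number of options, the noise level, and the normal distributions are assumptions. It estimates how often the winning option is truly the best one and compares this with the 1/n chance of a uniformly random pick.

```python
import random


def winner_is_best_rate(n_options=50, noise_sd=2.0, n_trials=5000, seed=1):
    """Estimate the probability that the option with the highest
    measured value also has the highest true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Assumption: i.i.d. standard normal true values, noisy unbiased measurements.
        true_values = [rng.gauss(0.0, 1.0) for _ in range(n_options)]
        measured = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        best = max(range(n_options), key=lambda i: true_values[i])
        winner = max(range(n_options), key=lambda i: measured[i])
        hits += int(winner == best)
    return hits / n_trials


p_winner_best = winner_is_best_rate()
p_random_best = 1 / 50  # a uniformly random pick among 50 options
print(f"P(winner is best) ~ {p_winner_best:.3f}; P(random pick is best) = {p_random_best:.3f}")
```

In this toy setup the winning option is usually not the best one, but it is still several times more likely to be the best than a random pick, which is the comparison that matters for choosing between the two procedures.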

Ob­ser­va­tions of this type may well have in­ter­est­ing im­pli­ca­tions for de­ci­sion-mak­ing. For ex­am­ple, they might prompt us to in­crease the re­li­a­bil­ity of our mea­sure­ments, or to aban­don mea­sure­ments that cost more than the value of the in­for­ma­tion they provide. A lot­tery may then per­haps be a less costly al­ter­na­tive. How­ever, ab­sent such sec­ondary ad­van­tages, the claims re­viewed in the pre­ced­ing para­graphs don’t fa­vor lot­ter­ies over choos­ing the win­ning op­tion.

‘Matthew effects:’ Small differ­ences in in­puts (e.g. tal­ent) can pro­duce heavy-tailed out­comes (e.g. wealth). This is a com­mon ob­ser­va­tion not limited to the liter­a­ture I re­viewed. For ex­am­ple, Mer­ton (1968) fa­mously de­scribed this phe­nomenon in his study of the re­ward sys­tem of sci­ence. He at­tributed it to ‘rich get richer’ dy­nam­ics, for which he es­tab­lished the term ‘Matthew effects’.

Two pub­li­ca­tions in which I en­coun­tered such claims are Pluch­ino et al. (2018, p. 1850014-8, Fig. 3) and Den­rell and Liu (2012, sc. “Model 1: Ex­treme Perfor­mance Indi­cates Strong Rich-Get-Richer Dy­nam­ics”); they are also a cen­tral con­cern of Frank (2016). This list is not com­pre­hen­sive.

Frank (2016) pro­poses the in­creas­ing size of win­ner-takes-all mar­kets due to net­work effects as an ex­pla­na­tion. Pluch­ino and col­leagues’ (2018) model in­di­cates that heavy-tailed out­comes can re­sult when the im­pacts of ran­dom events on the out­come met­ric are pro­por­tional to its cur­rent value and ac­cu­mu­late over time. Out­side of the liter­a­ture I re­viewed, prefer­en­tial at­tach­ment pro­cesses have at­tracted a large amount of at­ten­tion. They as­sume that a ran­dom net­work grows in such a way that new nodes are more likely to be con­nected to those ex­ist­ing nodes with a larger num­ber of con­nec­tions, and show that this re­sults in a heavy-tailed dis­tri­bu­tion of the num­ber of con­nec­tions per node.

In our context, findings of this type indicate that we cannot necessarily infer that the distribution of our options’ true values is heavy-tailed just based on heavy-tailed measurements. This is because the true values of our options might be the inputs of a process as described above, while our measurements might be the outcomes. However, again, this doesn’t imply that the expected value of choosing the winning option is suboptimal; in particular, it does nothing to recommend lotteries. For this reason, I did not review in more detail the proposed explanations for why we might find heavy-tailed outcomes despite less varied inputs.
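The mechanism of random events whose impact is proportional to the current value can be sketched as follows. This is a deliberately simplified toy process, not a reproduction of the model of Pluchino et al. (2018): the talent distribution, the event probabilities, the doubling and halving rule, and all names are assumptions made for illustration.

```python
import random
import statistics


def simulate_multiplicative_luck(n_agents=2000, n_steps=80, seed=2):
    """Return (talents, final_capital) for a toy multiplicative process."""
    rng = random.Random(seed)
    # Assumption: talent is narrowly distributed around 0.6 and clipped to [0, 1].
    talents = [min(max(rng.gauss(0.6, 0.1), 0.0), 1.0) for _ in range(n_agents)]
    capital = [10.0] * n_agents
    for _ in range(n_steps):
        for i in range(n_agents):
            event = rng.random()
            if event < 0.25:
                # Unlucky event: lose half of the current capital.
                capital[i] /= 2
            elif event < 0.5 and rng.random() < talents[i]:
                # Lucky event, exploited with probability equal to talent.
                capital[i] *= 2
    return talents, capital


talents, capital = simulate_multiplicative_luck()
talent_spread = max(talents) / statistics.median(talents)
capital_spread = max(capital) / statistics.median(capital)
print(f"max/median talent: {talent_spread:.2f}")
print(f"max/median capital: {capital_spread:.0f}")
```

The inputs (talent) differ by less than a factor of two between the top and the median agent, while the outcomes (capital) differ by several orders of magnitude. Observing such outcomes therefore tells us little about how spread out the inputs are.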

Sum­mary. Loosely speak­ing, the re­sults re­viewed in this sub­sub­sec­tion may help with iden­ti­fy­ing con­di­tions un­der which sim­ple es­ti­ma­tion pro­ce­dures pro­duce ex­ag­ger­ated differ­ences be­tween the op­tions we choose from. How­ever, they don’t by them­selves sug­gest that these sim­ple de­ci­sion pro­ce­dures re­sult in the wrong rank­ing of op­tions. There­fore, they con­sti­tute no ar­gu­ment against choos­ing the seem­ingly best op­tion, and in par­tic­u­lar no ar­gu­ment for us­ing a lot­tery in­stead.

***Ad­van­tages of lot­ter­ies are based on crite­ria other than max­i­miz­ing ex­pected value, e.g. fair­ness.***

Some re­sults ac­tu­ally say or im­ply that lot­ter­ies do have cer­tain ad­van­tages over other de­ci­sion pro­ce­dures. How­ever, I’ll now set aside sev­eral types of such ad­van­tages that ap­peal to a crite­rion other than the one I’m in­ter­ested in, i.e. max­i­miz­ing ex-ante ex­pected value.

Fair­ness. The alleged fair­ness of lot­ter­ies, and un­fair­ness of ‘naively mer­i­to­cratic’ de­ci­sions, is a com­mon theme in the liter­a­ture. Boyce (1994, pp. 457f.) refers to pre­vi­ous work that tries to ex­plain the oc­cur­rence of lot­ter­ies by their alleged fair­ness. [7] Thorn­gate (1988) as­sumes that “the ul­ti­mate goal [...] is to provide a means of al­lo­cat­ing re­sources viewed as fair by all con­tes­tants” (ibid., p. 6) and de­scribes the use of perfor­mance tests un­der cer­tain con­di­tions as “breed­ing grounds of in­vidious se­lec­tion vi­ti­at­ing the prin­ci­ple of fair­ness that char­ac­ter­izes the con­tests in which they are em­ployed”. Pluch­ino et al. (2018, p. 1850014-17) men­tion the “un­fair fi­nal re­sult” of their model. Again, this list is not com­pre­hen­sive.

Such claims are hard to evaluate for several reasons. It remains unclear which specific conception of fairness they appeal to, whether they attribute fairness to procedures or outcomes, or even whether they are normative claims as opposed to descriptive claims about perceived fairness. For example, Pluchino et al. (2018, p. 1850014-8) seem to think that an outcome in which “the most successful people were the most talented” would be fairer than an outcome violating that property; but they neither argue for this assertion nor say whether this criterion exhausts their conception of fairness.

For examples of a more extensive discussion that employs specific definitions of fairness see Boyle (1988) and Elster (1989); Avin (2018, pp. 9f.) refers to these definitions as well.

As an aside, the relationship between fairness and lotteries seems to be complex, with lotteries sometimes being perceived as decidedly unfair. See Hofstee (1990, p. 745) for an anecdote where the use of a lottery provoked anonymous phone threats to a decision maker.

In any case, I wasn’t in­ter­ested in pur­su­ing ques­tions around fair­ness be­cause they ten­ta­tively seem less rele­vant to me for the al­lo­ca­tion of EA fund­ing. Of course, EAs need not em­brace con­se­quen­tial­ism and thus could have rea­sons to abide by pro­ce­du­ral fair­ness con­straints; even con­se­quen­tial­ist EAs might in­trin­si­cally or in­stru­men­tally fa­vor fair out­comes. How­ever, it seems to me that con­cerns about fair­ness are more com­mon when de­ci­sions af­fect a large num­ber of peo­ple, when peo­ple can­not avoid be­ing af­fected by a de­ci­sion (or only at un­rea­son­able cost), or when de­ci­sion mak­ers are ac­countable to many stake­hold­ers with di­verse prefer­ences. In a similar vein, Avin (2018) notes that “[w]hile the drive for effi­ciency is of­ten in­ter­nal to the or­gani­sa­tion, there are of­ten ex­ter­nal drivers for fair­ness”. For ex­am­ple, we com­monly con­sider the fair­ness of tax­a­tion, so­cial welfare, and similar policy is­sues; ad­mis­sions to pub­lic schools; or de­ci­sions that al­lo­cate a sig­nifi­cant share of all available re­sources, such as in the case of ma­jor in­sti­tu­tional sci­ence fun­ders. By con­trast, I’d guess that most EA fund­ing is ei­ther al­lo­cated by per­sonal dis­cre­tion, or by or­ga­ni­za­tions whose fo­cus on effec­tive­ness is shared by key stake­hold­ers such as donors and em­ploy­ees.

Psy­cholog­i­cal im­pact on de­ci­sion mak­ers or peo­ple af­fected by de­ci­sions. For ex­am­ple, Thorn­gate (1988, p. 14) recom­mends lot­ter­ies partly on the grounds that they “might re­lieve ad­ju­di­ca­tors of re­spon­si­bil­ity for dis­tinc­tions they are in­ca­pable of mak­ing”, and that “[i]t might leave the losers with some pride, and the win­ners with some hu­mil­ity, if ev­ery­one knew that, in the end, chance and not merit sealed their fate”.

On the other hand, Elster (1989, pp. 105f.) is skeptical whether reaping such benefits from lotteries is feasible, and Pluchino et al. (2011a, p. 3509) mention “the possible negative psychological feedback of employees to a denied and expected promotion” as an objection to random promotions.

Such psy­cholog­i­cal effects may be rele­vant when al­lo­cat­ing EA fund­ing. How­ever, my guess is that they would be de­ci­sive only in rare cases. For this rea­son, I didn’t pur­sue this is­sue fur­ther and for now fo­cused on max­i­miz­ing first-or­der ex­pected value while ig­nor­ing these sec­ondary effects.

Other crite­ria I set aside. Elster (1989, p. 109) quips that “[w]hen con­sen­sus fails, we might as well use a lot­tery.” Why a lot­tery? Ac­cord­ing to Elster, lot­ter­ies are more salient, harder to ma­nipu­late [8], and avoid bad in­cen­tives. [9] [10]

Pluch­ino et al. (2018) use an agent-based model in­tended to quan­ti­ta­tively as­sess the im­pact of differ­ent fund­ing strate­gies, in­clud­ing lot­ter­ies, on the life­time suc­cess of in­di­vi­d­u­als. Per­haps mo­ti­vated by con­cerns around fair­ness, they com­pare strate­gies ac­cord­ing to “the av­er­age per­centage [...] of tal­ented peo­ple which, dur­ing their ca­reer, in­crease their ini­tial cap­i­tal/​suc­cess” (Pluch­ino et al., 2018, p. 1850014-20). [11] Note that this met­ric ig­nores the amounts by which the suc­cess of in­di­vi­d­u­als changes. It is there­fore a poor proxy for what I’d be most in­ter­ested in, i.e. the to­tal sum of suc­cess across all in­di­vi­d­u­als.
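A small numerical example shows how the two metrics can disagree. The numbers and function names below are made up for illustration and do not come from Pluchino et al. (2018); for simplicity every individual is treated as talented.

```python
def share_improved(initial, final):
    """Fraction of individuals whose success ends above its starting level."""
    return sum(f > i for i, f in zip(initial, final)) / len(initial)


def total_success(final):
    """Sum of final success across all individuals."""
    return sum(final)


initial = [10, 10, 10, 10]
outcome_a = [11, 11, 11, 9]    # hypothetical strategy A: many small gains
outcome_b = [10, 10, 9, 200]   # hypothetical strategy B: one very large gain

print(share_improved(initial, outcome_a), total_success(outcome_a))
print(share_improved(initial, outcome_b), total_success(outcome_b))
```

Strategy A looks better on the share of individuals who improve, while strategy B looks far better on total success, so ranking strategies by the first metric need not agree with ranking them by the second.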

Limited ev­i­dence pro­vided by agent-based models

Many pub­li­ca­tions I re­viewed pre­sent agent-based mod­els, some­times as the only type of ev­i­dence. Un­for­tu­nately, I be­lieve we can con­clude lit­tle from these mod­els even for the cases they were in­tended to cap­ture.

To ex­plain why, I’ll first sum­ma­rize my reser­va­tions about the agent-based mod­els of Pluch­ino et al. (2010) on pro­mo­tions in hi­er­ar­chi­cal or­ga­ni­za­tions. I be­lieve this ex­am­ple illu­mi­nates struc­tural short­com­ings of agent-based mod­els, which I’ll ex­plain next. In prin­ci­ple, these short­com­ings could be over­come by ex­ten­sive ex­per­i­ments and a care­ful dis­cus­sion. How­ever, I’ll go on to de­scribe why the pub­li­ca­tions I’ve re­viewed suc­ceed at this only to a limited ex­tent, of­ten com­ing back to Pluch­ino et al. (2010) and re­lated work as the most illus­tra­tive ex­am­ple.

The model of Pluch­ino et al. (2010) is based on im­plau­si­ble as­sump­tions, and re­gres­sion to the mean is the main ex­pla­na­tion of their re­sults. [12] Pluch­ino et al. (2010) as­sume [13] that the perfor­mance of all em­ploy­ees within an or­ga­ni­za­tion is de­ter­mined by in­de­pen­dent draws from the same dis­tri­bu­tion, with a new draw af­ter ev­ery pro­mo­tion. An em­ployee’s cur­rent perfor­mance is thus com­pletely un­in­for­ma­tive of their perfor­mance af­ter a pro­mo­tion. What is more, there are no in­ter­per­sonal differ­ences rele­vant to perfor­mance.

Put differently and ignoring arguably irrelevant details of their model, they effectively assume an organization to be a collection of independent fair dice, with organizational performance being represented by the sum of points shown. Promotions correspond to rerolling individual dice. What’s the best strategy if you want to quickly maximize the sum of the dice? Due to regression to the mean, it’s clearly best to reroll the dice with the lowest scores first. Conversely, rerolling the dice with the highest scores will result in the largest possible expected decrease in performance. Rerolling dice uniformly at random will be about halfway in between these extremes.

In other words, we can eas­ily ex­plain their main re­sult: Un­der the as­sump­tions de­scribed above, pro­mot­ing the best em­ploy­ees will dra­mat­i­cally harm or­ga­ni­za­tional perfor­mance; pro­mot­ing the worst em­ploy­ees will dra­mat­i­cally in­crease perfor­mance; and ran­dom pro­mo­tions will be about halfway in be­tween.
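The dice analogy can be checked with a short simulation. This sketch implements only the simplified analogy described above, not the actual model of Pluchino et al. (2010); the number of dice, the number of rerolls, and the function name are assumptions for illustration.

```python
import random
import statistics


def average_change(strategy, n_dice=20, n_rerolls=5, n_trials=10000, seed=3):
    """Average change in the sum of the dice after rerolling n_rerolls dice,
    chosen by 'best' (highest first), 'worst' (lowest first), or 'random'."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n_trials):
        dice = [rng.randint(1, 6) for _ in range(n_dice)]
        before = sum(dice)
        if strategy == "best":
            chosen = sorted(range(n_dice), key=lambda i: dice[i], reverse=True)[:n_rerolls]
        elif strategy == "worst":
            chosen = sorted(range(n_dice), key=lambda i: dice[i])[:n_rerolls]
        else:
            chosen = rng.sample(range(n_dice), n_rerolls)
        # A "promotion" is an independent redraw, uninformed by the old value.
        for i in chosen:
            dice[i] = rng.randint(1, 6)
        changes.append(sum(dice) - before)
    return statistics.fmean(changes)


change_best = average_change("best")
change_random = average_change("random")
change_worst = average_change("worst")
print(f"reroll best: {change_best:+.2f}, random: {change_random:+.2f}, worst: {change_worst:+.2f}")
```

Each rerolled die showing v changes the expected sum by 3.5 − v, so rerolling the highest dice lowers the sum, rerolling the lowest raises it, and random rerolls leave it unchanged on average. The ordering is a direct consequence of the independence assumption rather than of anything specific to promotions.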

Unfortunately, as I’ll argue below, the key assumptions behind this result are implausibly extreme. [14]

Ad-hoc con­struc­tion with­out an un­der­ly­ing the­ory. There seems to be no unify­ing the­ory be­hind the agent-based mod­els I’ve seen. They were made up ad hoc, some­times hap­haz­ardly draw­ing upon pre­vi­ous work in other fields. [15] It is there­fore hard to im­me­di­ately see whether they are based on sound as­sump­tions, and if so to what ex­tent their re­sults gen­er­al­ize. In­stead, we have to as­sess them on a case-by-case ba­sis.

Uncanny medium level of complexity. One useful way to study the world is perhaps to examine a specific situation in a lot of detail. While findings obtained in this way may be hard to generalize, we can at least reliably establish what happened at a particular place and time.

Model­ing, by con­trast, is based on de­liber­ate sim­plifi­ca­tion and ab­strac­tion. When a model is based on a small num­ber of sim­ple as­sump­tions and pa­ram­e­ters, it is easy to ex­am­ine ques­tions such as: Which as­sump­tions are cru­cial for the de­rived re­sults, and why? Which types of real-world sys­tems, if any, are faith­fully mod­eled? How can we de­ter­mine ap­pro­pri­ate val­ues for the model pa­ram­e­ters based on em­piri­cal mea­sure­ments of such sys­tems?

For ex­am­ple, con­sider Smith and Win­kler’s (2006) work on the op­ti­mizer’s curse. Their set­ting is highly ab­stract, and their as­sump­tions par­si­mo­nious. Their re­sults thus ap­ply to many de­ci­sions made un­der un­cer­tainty. Their cen­tral re­sult is not only ob­served in simu­la­tions but can be crisply stated as a the­o­rem (ibid., p. 315, sc. 2.3). By ex­am­in­ing the proof we can un­der­stand why it holds. Over­all, their dis­cus­sion doesn’t stray far from ba­sic prob­a­bil­ity the­ory, which is a well-un­der­stood the­ory. This helps us un­der­stand whether their as­sump­tions are rea­son­able, and whether the mod­ifi­ca­tions they chose to ex­plore (e.g. ibid., p. 314, Table 2) are rele­vant. Similar re­marks ap­ply to Den­rell and Liu (2012).
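
The phenomenon itself can be illustrated in a few lines. This is a minimal sketch rather than Smith and Winkler’s own setup: standard normal true values, independent Gaussian estimation noise, and all parameter values are assumptions made here for illustration.

```python
import random
import statistics


def mean_surprise(n_options=10, noise_sd=1.0, n_trials=5000, seed=0):
    """Average of (true value - estimated value) for the chosen option.

    In each trial the option with the highest noisy estimate is chosen.
    A negative result means the chosen option tends to disappoint.
    """
    rng = random.Random(seed)
    surprises = []
    for _ in range(n_trials):
        true_values = [rng.gauss(0, 1) for _ in range(n_options)]
        estimates = [v + rng.gauss(0, noise_sd) for v in true_values]
        chosen = max(range(n_options), key=estimates.__getitem__)
        surprises.append(true_values[chosen] - estimates[chosen])
    return statistics.fmean(surprises)


if __name__ == "__main__":
    for sd in (0.0, 0.5, 1.0, 2.0):
        print("noise_sd =", sd, "mean surprise =", round(mean_surprise(noise_sd=sd), 2))
```

With noisy estimates and more than one option, the chosen option’s true value falls short of its estimate on average, because selecting the highest estimate also selects for favorable noise. Without noise the shortfall disappears.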

Con­trast this with Pluch­ino and col­leagues’ (2010) agent-based model of pro­mo­tions in hi­er­ar­chi­cal or­ga­ni­za­tions. It de­pends on the fol­low­ing as­sump­tions and pa­ram­e­ters, some of which are quite com­plex them­selves:

  • A di­rected graph rep­re­sent­ing the or­ga­ni­za­tional struc­ture, i.e. pos­si­ble pro­mo­tions from one po­si­tion to an­other.

  • Weights con­trol­ling to what ex­tent the perfor­mance in a po­si­tion af­fects or­ga­ni­za­tional perfor­mance.

  • The age at which em­ploy­ees re­tire.

  • A perfor­mance thresh­old for firing em­ploy­ees.

  • The dis­tri­bu­tions from which the de­grees of com­pe­tence and age of newly re­cruited em­ploy­ees are ini­tially drawn.

  • The dy­namic rules for up­dat­ing the age, perfor­mance lev­els, and po­si­tions (via pro­mo­tions) of em­ploy­ees.

Given this level of com­plex­ity and the lack of a the­o­ret­i­cal foun­da­tion, their simu­la­tion re­sults ini­tially ap­pear as a ‘brute fact’. We can’t im­me­di­ately see which as­sump­tions they de­pend on, and how sen­si­tive they might be to the val­ues of var­i­ous pa­ram­e­ters. As a con­se­quence, it is hard to see which real-world or­ga­ni­za­tions their re­sults may ap­ply to. Does my or­ga­ni­za­tion have to have ex­actly 160 em­ploy­ees and six lev­els of hi­er­ar­chy, as in their model? What about the fre­quency of pro­mo­tions rel­a­tive to the age of re­cruit­ment and re­tire­ment? Is the var­i­ance of perfor­mance be­tween po­si­tions rele­vant – and if so, how could I mea­sure its value? Etc.

I have argued above that it is in fact possible to easily explain their results – they are due to regression to the mean. Note, however, that my analysis was based on removing arguably irrelevant complexity from the model; its many details obstruct rather than assist understanding. This problem is illustrated by Pluchino et al. (2010) never presenting a similar analysis. Instead, they have to answer the above questions through additional experiments. While they partly succeed in doing so, I’ll argue that they ultimately fail to address the main problem I identified earlier, i.e. their use of arguably implausible assumptions.

Lack of em­piri­cal val­i­da­tion. By em­piri­cal val­i­da­tion I mean roughly stat­ing a cor­re­spon­dence rule be­tween both the in­puts (as­sump­tions and pa­ram­e­ters) and the out­puts of a model on one hand, and the real world on the other hand. If these rules are ap­pro­pri­ately op­er­a­tional­ized, we can then em­piri­cally test whether the in­put-out­put pairs of a model match the cor­re­spond­ing real-world prop­er­ties.

I be­lieve it is tel­ling that the only model I’ve seen that pre­sents some amount of quan­ti­ta­tive val­i­da­tion is the one by Smith and Win­kler (2006, pp. 314f.). By con­trast, all agent-based mod­els I’ve seen were val­i­dated only in a qual­i­ta­tive way, [16] if at all. [17] The prob­lem here is not just the ab­sence of rele­vant data, but that it’s un­clear pre­cisely which real-world phe­nom­ena (if any) the model is sup­posed to rep­re­sent. One ex­cep­tion is Har­nagel’s (2018) agent-based model of sci­ence fund­ing, which ex­tends Avin’s (2017) epistemic land­scape based on em­piri­cal biblio­met­ric data. [18]

Missing or unconvincing discussion of assumptions and parameters. When there is no empirical validation, it would at least be desirable to have a rough discussion of a model’s assumptions and default parameter values. While evidence from such a discussion would be weaker, it would at least partly illuminate how the model is meant to relate to the real world, and could serve as a sanity check of its assumptions and default parameter values.

Avin (2017) provides the most extensive discussion of this type that I’ve seen; I’ll just point to a few examples. First, he clarifies that the scalar value attached to each research project in his model is meant to represent the amount of discovered “truths that will eventually contribute in a meaningful way to wellbeing” (ibid., p. 5). To motivate the selection mechanism operating on the modeled population of scientists, he refers to empirical work indicating that “[n]ational funding bodies support much of contemporary science” (ibid., p. 4). He also acknowledges specific limitations of his model, for example its landscape’s low dimensionality (ibid., p. 8, especially fn. 3). Crucially, he is able to defend his choice of a particular parameter, the size of the scientists’ field of vision, based on previous empirical results (ibid., p. 12). This is particularly important as he later shows that the relative performance of some funding strategies flips depending on that parameter’s value (ibid., p. 20, Fig. 6).

As a less con­vinc­ing ex­am­ple, con­sider again Pluch­ino et al. (2010), which un­for­tu­nately I’ve found to be more typ­i­cal among the mod­els I ex­am­ined. They do provide con­text on their pa­ram­e­ters and as­sump­tions, for ex­am­ple when say­ing that their “de­gree of com­pe­tence” vari­able “in­cludes all the fea­tures (effi­ciency, pro­duc­tivity, care, dili­gence, abil­ity to ac­quire new skills) char­ac­ter­iz­ing the av­er­age perfor­mance of an agent in a given po­si­tion at a given level” (ibid., p. 468). How­ever, aside from their dis­cus­sion be­ing gen­er­ally less ex­ten­sive and not cov­er­ing all of their as­sump­tions and pa­ram­e­ters, I be­lieve their dis­cus­sion is un­con­vinc­ing at the most cru­cial point.

This problem concerns what they call the Peter hypothesis – the assumption that an employee’s current performance is completely uninformative about their performance after a promotion. As I mentioned earlier, they find that their advertised result – i.e., random promotions outperforming a scheme that promotes the best eligible employee – depends on that hypothesis, and reverses under an alternative assumption. However, their only discussion of the Peter hypothesis is the following:

“Ac­tu­ally, the sim­ple ob­ser­va­tion that a new po­si­tion in the or­ga­ni­za­tion re­quires differ­ent work skills for effec­tively perform­ing the new task (of­ten com­pletely differ­ent from the pre­vi­ous one), could sug­gest that the com­pe­tence of a mem­ber at the new level could not be cor­re­lated to that at the old one.” (Pluch­ino et al., 2010, p. 467)

Similarly, in a later pub­li­ca­tion that ex­tends their model in an un­re­lated way, Sobkow­icz (2010) writes that:

“[As­sum­ing that perfor­mance af­ter a pro­mo­tion is in­de­pen­dent from pre-pro­mo­tion perfor­mance is] suit­able in situ­a­tions where the new post calls for to­tally differ­ent set of skills (sales­man pro­moted to sales man­ager or to mar­ket­ing man­ager po­si­tion).”

How­ever, these ar­gu­ments seem to show at most that past perfor­mance is not a perfect pre­dic­tor of fu­ture perfor­mance. But as­sum­ing no re­la­tion what­so­ever seems clearly ex­ces­sive, for the fol­low­ing two rea­sons. First, there’ll usu­ally at least be some over­lap be­tween tasks be­fore and af­ter a pro­mo­tion. Se­cond, perfor­mance cor­re­lates across differ­ent cog­ni­tive tasks, cap­tured by psy­cholog­i­cal con­structs such as gen­eral in­tel­li­gence that differ be­tween in­di­vi­d­u­als and aren’t af­fected by pro­mo­tions. (The ab­sence in their mod­els of such rele­vant in­ter­per­sonal differ­ences isn’t dis­cussed at all.)

Miss­ing or un­con­vinc­ing ab­la­tion and sen­si­tivity stud­ies. Even with­out a full the­o­ret­i­cal un­der­stand­ing, ad­di­tional ex­per­i­ments can help illu­mi­nate a model’s be­hav­ior. In par­tic­u­lar: When there are sev­eral as­sump­tions or rules con­trol­ling the model’s dy­nam­ics, what hap­pens if we re­move them one-by-one (ab­la­tion stud­ies)? And to what ex­tent are re­sults sen­si­tive to the val­ues of the in­put pa­ram­e­ters?

Again, the prob­lem I’ve found was not the com­plete ab­sence of such ex­per­i­ments, but that they of­ten failed to ad­dress the most rele­vant points. Con­sider again Pluch­ino et al. (2010). While they don’t provide de­tails, they at least as­sure us “that the nu­mer­i­cal re­sults that we found for such an or­ga­ni­za­tion are very ro­bust and show only a lit­tle de­pen­dence on the num­ber of lev­els or on the num­ber of agents per level (as long as it de­creases go­ing from the bot­tom to the top)” (ibid., p. 468), and “that all the re­sults pre­sented do not de­pend dras­ti­cally on small changes in the value of the free pa­ram­e­ters” (ibid., p. 469).

More importantly, they do show that their result crucially depends on the Peter hypothesis described above, and reverses under an alternative assumption. This alternative instead assumes that current performance is an unbiased estimate of future performance, thus eliminating the regression to the mean effect I described above.

Pluchino et al. (2010) thus presented opposing results for two alternative assumptions, both of which are implausibly extreme. It would therefore have been particularly important to investigate the model’s behavior under more realistic intermediate assumptions. Unfortunately, everything they do in that direction seems beside the point. Rather than considering intermediate assumptions, they discuss the case where it isn’t known which of the two assumptions holds (ibid., p. 470); [19] mixing the strategies of promoting the best and worst employees (ibid., p. 470, Fig. 3); and in later work the case where some promotions are described by one and others by the other assumption (Pluchino et al., 2011a, p. 3506, Fig. 10).

In­stead, they ex­per­i­men­tally in­ves­ti­gate about ev­ery other pos­si­ble vari­a­tion in their model, par­tic­u­larly in their fol­low-up work (Pluch­ino et al., 2011a). For ex­am­ple, they change how of­ten perfor­mance is up­dated rel­a­tive to age, re­place the sim­ple pyra­mi­dal or­ga­ni­za­tional struc­ture from their 2010 pa­per with more com­pli­cated trees, vary the weights with which in­di­vi­d­ual po­si­tions con­tribute to or­ga­ni­za­tional perfor­mance, eval­u­ate perfor­mance rel­a­tive to the steady state un­der a baseline strat­egy rather than rel­a­tive to the ini­tial perfor­mance draw, and in­tro­duce age-de­pen­dent changes in perfor­mance. Of course, if my origi­nal anal­y­sis is cor­rect, it’s not sur­pris­ing that the rel­a­tive perfor­mance of the ran­dom strat­egy is ro­bust to all of these vari­a­tions.
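
To make concrete what an intermediate assumption between the two extremes could look like, here is a toy sketch. It is not Pluchino and colleagues’ model: the two-level structure, the weight `top_weight` of the senior position, the Gaussian performance scores, and the correlation parameter `rho` between pre- and post-promotion performance are all assumptions chosen for illustration. Setting `rho = 0` mimics the Peter hypothesis, and `rho = 1` means performance carries over unchanged.

```python
import math
import random


def mean_org_performance(strategy, rho, n_staff=10, top_weight=2.0,
                         n_trials=5000, seed=0):
    """Average organizational performance after a single promotion.

    strategy: "best", "worst", or "random" (who gets promoted).
    rho: correlation between performance before and after promotion.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        staff = [rng.gauss(0, 1) for _ in range(n_staff)]
        if strategy == "best":
            i = max(range(n_staff), key=staff.__getitem__)
        elif strategy == "worst":
            i = min(range(n_staff), key=staff.__getitem__)
        else:
            i = rng.randrange(n_staff)
        # Post-promotion performance: partly carried over, partly a fresh draw.
        promoted = rho * staff[i] + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        # The vacated junior position is filled by a new recruit.
        staff[i] = rng.gauss(0, 1)
        total += top_weight * promoted + sum(staff)
    return total / n_trials


if __name__ == "__main__":
    for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
        row = [round(mean_org_performance(s, rho), 2)
               for s in ("best", "random", "worst")]
        print("rho =", rho, "best/random/worst =", row)
```

In this toy, promoting an employee with score x changes expected organizational performance by (top_weight * rho - 1) * x, so promoting the best beats random promotion only when rho exceeds 1 / top_weight. Whether anything similar holds in the full agent-based model is exactly the kind of question an intermediate-assumption experiment would have to answer.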

Again, Avin (2017) is the best example of doing this right (ibid., pp. 18-27, sc. 5). However, even here important open questions remain, such as whether the results depend on assuming a landscape with just two dimensions, an assumption acknowledged to be problematic.

Lot­ter­ies are com­pared against sub­op­ti­mal alternatives

In my view, the most egregious problem with Pluchino and colleagues’ (2011a) follow-up work on promotions in hierarchical organizations illustrates yet another common issue: lotteries are rarely shown to be optimal among a rich set of alternatives, let alone all possible strategies. Instead, they are compared to a small number of alternatives, which in some cases clearly don’t include the best one.

Re­call that Pluch­ino et al. (2010) found that if pre-pro­mo­tion perfor­mance is com­pletely un­in­for­ma­tive of post-pro­mo­tion perfor­mance, then pro­mot­ing the worst em­ploy­ees dra­mat­i­cally out­performs both pro­mot­ing the best and ran­dom pro­mo­tions. How­ever, this win­ning strat­egy of pro­mot­ing the worst is com­pletely ab­sent from their fol­low-up pa­per (Pluch­ino et al., 2011a). There they just com­pare ran­dom pro­mo­tions with pro­mot­ing the best, which as I ar­gued ear­lier is the worst pos­si­ble strat­egy given the Peter hy­poth­e­sis. Put differ­ently, they ex­hibit pre­cisely those re­sults which are least in­for­ma­tive about the perfor­mance of ran­dom pro­mo­tions. It is, for ex­am­ple, not sur­pris­ing that no mix­ture of these two strate­gies out­performs com­pletely ran­dom pro­mo­tions (ibid., p. 3503, Fig. 7).

Similarly, in Figure 10 of Pluch­ino et al. (2018, p. 1850014-20) we see that in their simu­la­tions, at least for small amounts of to­tal fund­ing per round, giv­ing the same small amount of fund­ing to ev­ery­one worked even bet­ter than fund­ing by lot­tery. Such egal­i­tar­ian fund­ing schemes also out­performed ran­dom ones at the task of dis­tribut­ing a fixed amount of funds (Pluch­ino et al., 2018, p. 1850014-23, Fig. 12).

Lastly, Avin (2017) evaluates lotteries only against strategies which are maximally short-sighted or explore as little as possible. It remains an open question whether one could construct an even better nonrandom strategy.

Un­con­vinc­ing ar­gu­ments and references

Espe­cially in the pub­li­ca­tions by Biondo et al. and Pluch­ino et al., I en­coun­tered sev­eral other pas­sages I didn’t find con­vinc­ing. They were of­ten suffi­ciently vague that it’s hard to con­clu­sively demon­strate a mis­take. They wouldn’t di­rectly in­val­i­date ad­ver­tised find­ings about the perfor­mance of ran­dom strate­gies if false, but still make me more wary of tak­ing such find­ings at face value.

***Mi­sun­der­stand­ings and omis­sions in refer­ences to Si­na­tra and col­leagues’ (2016) model of sci­en­tific ca­reers***

Sum­mary of Si­na­tra et al. (2016). In a pa­per pub­lished in Science, Si­na­tra et al. (2016) an­a­lyze the pub­li­ca­tion records of a large sam­ple [20] of sci­en­tists. They find “two fun­da­men­tal char­ac­ter­is­tics of a sci­en­tific ca­reer” (ibid., p. 596), both of which are promi­nently men­tioned in their pa­per’s sum­mary and ab­stract.

Their first re­sult is a “ran­dom-im­pact rule” (ibid., p. 596). It says that the “high­est-im­pact work can be, with the same prob­a­bil­ity, any­where in the se­quence of pa­pers pub­lished by a sci­en­tist” (ibid., p. 596). As a mea­sure of a pub­li­ca­tion’s im­pact, they use its num­ber of cita­tions af­ter 10 years.

Note that this re­sult con­cerns the dis­tri­bu­tion of im­pact within a given sci­en­tist’s ca­reer. In the con­text of fund­ing de­ci­sions, we’d be more in­ter­ested in differ­ences be­tween sci­en­tists.

Their sec­ond main re­sult ad­dresses just such differ­ences. Speci­fi­cally, their fa­vored model con­tains a pa­ram­e­ter Q, which they de­scribe as the “sus­tained abil­ity to pub­lish high-im­pact pa­pers” (p. 596). Cru­cially, Q is con­stant within each ca­reer, but differs be­tween sci­en­tists. The higher a sci­en­tist’s Q the larger the ex­pected value of their next pa­per’s im­pact. Similarly, if we com­pare sci­en­tists with the same num­ber of pub­li­ca­tions then higher-Q in­di­vi­d­u­als will likely have a larger to­tal im­pact. [21]

More­over, Si­na­tra et al. (2016, pp. aaf5239-4f.) de­scribe how a sci­en­tist’s Q can be re­li­ably mea­sured early in their ca­reer. Over­all, they con­clude that “[b]y de­ter­min­ing the value of Q dur­ing the early stages of a sci­en­tific ca­reer, we can use it to pre­dict fu­ture ca­reer im­pact.” (ibid., p. aaf5239-5) [22]

Note that Si­na­tra et al. (2016, pp. aaf5239-2f.) did con­sider, but statis­ti­cally re­ject, an al­ter­na­tive “ran­dom-im­pact model” with­out the pa­ram­e­ter Q, i.e. as­sum­ing no in­ter­per­sonal differ­ences in abil­ities.
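
A toy simulation can show how the two findings coexist. It is only a sketch under assumptions made here, not a reproduction of Sinatra and colleagues’ fitted model: each paper’s impact is taken to be the product of a fixed per-scientist factor `q` and an independent lognormal luck term, and the mean log-impact of early papers serves as a crude stand-in for an estimate of Q.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_careers(n_scientists=2000, n_papers=20, q_sigma=0.5,
                     luck_sigma=1.0, seed=0):
    """Return (share of careers peaking in the first half,
    correlation between early and late mean log-impact)."""
    rng = random.Random(seed)
    half = n_papers // 2
    peaks_in_first_half = 0
    early, late = [], []
    for _ in range(n_scientists):
        q = rng.lognormvariate(0.0, q_sigma)  # constant within a career
        impacts = [q * rng.lognormvariate(0.0, luck_sigma)
                   for _ in range(n_papers)]
        if impacts.index(max(impacts)) < half:
            peaks_in_first_half += 1
        early.append(statistics.fmean([math.log(c) for c in impacts[:half]]))
        late.append(statistics.fmean([math.log(c) for c in impacts[half:]]))
    return peaks_in_first_half / n_scientists, pearson(early, late)


if __name__ == "__main__":
    share, corr = simulate_careers()
    print("share of careers peaking in first half:", round(share, 3))
    print("early vs. late log-impact correlation:", round(corr, 3))
```

By construction, the highest-impact paper is equally likely to appear anywhere in a simulated career, yet early output still predicts later output because `q` differs between scientists. Randomness within a career therefore does not imply that scientists are interchangeable.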

My takeaway. Sinatra et al. (2016) disentangle the roles of luck, productivity, and ability [23] in scientific careers (cf. ibid., p. aaf5239-6). They find that the role of luck is considerable but limited. If their analysis is sound, there are persistent differences between scientists’ abilities to publish much-cited papers, which can be reliably estimated based on their track records. If we were interested in maximizing citations, this might suggest funding the scientists with the highest estimated ability. In no way do their results favor funding by lottery. If anything, they suggest that we could find a reliable and valid ‘meritocratic’ strategy despite considerable noise in the available data.

Refer­ences to Si­na­tra et al. (2016) el­se­where. I en­coun­tered sev­eral refer­ences to Si­na­tra et al. (2016) that only men­tioned, with vary­ing amounts of clar­ity, their find­ing that im­pact is ran­domly dis­tributed within a ca­reer. Th­ese refer­ences were gen­er­ally made to sup­port claims around the large role of luck or the good perfor­mance of ran­dom­ized strate­gies. Failing to men­tion the find­ing about differ­ences in abil­ity be­tween sci­en­tists in this con­text strikes me as a rele­vant omis­sion.

Con­sider for ex­am­ple:

“Scien­tific im­pact is ran­domly dis­tributed, with high pro­duc­tivity alone hav­ing a limited effect on the like­li­hood of high-im­pact work in a sci­en­tific ca­reer.” (Beau­tiful Minds blog @ Scien­tific Amer­i­can)
“Actually, such conclusions [about diminishing marginal returns of research funding found by other papers] should not be a surprise in the light of the other recent finding [Sinatra et al., 2016] that impact, as measured by influential publications, is randomly distributed within a scientist’s temporal sequence of publications. In other words, if luck matters, and if it matters more than we are willing to admit, it is not strange that meritocratic strategies reveal less effective than expected, in particular if we try to evaluate merit ex-post.” (Pluchino et al., 2018, p. 1850014-18)

The first quote is from a list of find­ings taken to sup­port the claim that “we miss out on a re­ally im­por­tance [sic] piece of the suc­cess pic­ture if we only fo­cus on per­sonal char­ac­ter­is­tics in at­tempt­ing to un­der­stand the de­ter­mi­nants of suc­cess”. Now it is true that Si­na­tra et al. (2016) find the im­pact of a given sci­en­tist’s next work to be con­sid­er­ably in­fluenced by luck. With re­spect to the “like­li­hood of high-im­pact work in a sci­en­tific ca­reer”, it is true that they find the effect of pro­duc­tivity to be limited. How­ever, they also find that the afore­men­tioned like­li­hood in fact is to a con­sid­er­able ex­tent de­ter­mined by differ­ences in “per­sonal char­ac­ter­is­tics”, i.e. their pa­ram­e­ter Q.

The sec­ond quote is from a dis­cus­sion of whether it is “more effec­tive to give large grants to a few ap­par­ently ex­cel­lent re­searchers, or small grants to many more ap­par­ently or­di­nary re­searchers” (Pluch­ino et al., 2018, p. 1850014-18). How­ever, I fail to see how Si­na­tra and col­leagues’ re­sults are re­lated to find­ings about diminish­ing marginal re­turns of re­search fund­ing at all. [24] In­deed, they didn’t con­sider any fund­ing data.

***Sweep­ing gen­er­al­iza­tions and dis­cus­sions that are vague or du­bi­ous***

Biondo and col­leagues’ dis­cus­sion of the Effi­cient Mar­ket Hy­poth­e­sis. For ex­am­ple, I’m puz­zled by Biondo and col­leagues’ (2013a) dis­cus­sion of the Effi­cient Mar­ket Hy­poth­e­sis:

“We can roughly say that two main refer­ence mod­els of ex­pec­ta­tions have been widely es­tab­lished within eco­nomic the­ory: the adap­tive ex­pec­ta­tions model and the ra­tio­nal ex­pec­ta­tion model. [...]
Whereas, the ra­tio­nal ex­pec­ta­tions ap­proach [...] as­sumes that agents know ex­actly the en­tire model de­scribing the eco­nomic sys­tem and, since they are en­dowed by perfect in­for­ma­tion, their fore­cast for any vari­able co­in­cides with the ob­jec­tive pre­dic­tion pro­vided by the­ory. [...]
The so-called Effi­cient Mar­ket Hy­poth­e­sis, which refers to the ra­tio­nal ex­pec­ta­tion Models [...]
Ra­tional ex­pec­ta­tions the­o­rists would im­me­di­ately bet that the ran­dom strat­egy will eas­ily loose [sic] the com­pe­ti­tion [...]” (Biondo et al., 2013a, pp. 608f.)
“Thus, it is the­o­ret­i­cally con­se­quent that, if the Effi­cient Mar­kets Hy­poth­e­sis held, the fi­nan­cial mar­kets would re­sult com­plete, effi­cient and perfectly com­pet­i­tive. This im­plies that, in pres­ence of com­plete in­for­ma­tion, ran­dom­ness should play no role, since the Effi­cient Mar­ket Hy­poth­e­sis would gen­er­ate a perfect trad­ing strat­egy, able to pre­dict ex­actly the mar­ket val­ues, em­bed­ding all the in­for­ma­tion about short and long po­si­tions wor­ld­wide.” (Biondo et al., 2013a, p. 615)

My un­der­stand­ing is that the trad­ing strat­egy sug­gested by the Effi­cient Mar­ket Hy­poth­e­sis pre­cisely is the ran­dom strat­egy vin­di­cated by Biondo and col­leagues’ em­piri­cal anal­y­sis. It’s hard to be cer­tain, but the above dis­cus­sion seems to sug­gest they think the op­po­site.

A puz­zling re­mark on large or­ga­ni­za­tions. In the con­text of dis­cussing psy­cholog­i­cal effects ig­nored by their model, Pluch­ino et al. (2011a, p. 3509) as­sert that “in a very big com­pany it is very likely that the em­ploy­ees com­pletely ig­nore the pro­mo­tion strate­gies of their man­agers”. While they don’t seem to rely on that as­ser­tion in that dis­cus­sion or any­where else, it still seems bizarre to me, and I have no idea why they think this is the case.

Sweeping generalizations based on the superficial similarity that several findings involve randomness or luck. For example, Pluchino et al. (2011b, p. 3944) claim that their finding about randomly selecting members of parliament “is in line with the positive role which random noise plays often in nature and in particular in physical systems [...]. On the other hand, it goes also in the same direction of the recent discovery [...] that, under certain conditions, the adoption of random promotion strategies improves the efficiency of human hierarchical organizations [...].”

Biondo et al. (2013a, p. 607) are even more far-reach­ing in their in­tro­duc­tion:

“In fact there are many ex­am­ples where ran­dom­ness has been proven to be ex­tremely use­ful and benefi­cial. The use of ran­dom num­bers in sci­ence is very well known and Monte Carlo meth­ods are very much used since long time [...].”

While none of these claims is un­am­bigu­ously false, I find their rele­vance du­bi­ous. Both the crite­ria ac­cord­ing to which we make judg­ments such as “use­ful and benefi­cial” and the role played by ran­dom­ness seem to vary dras­ti­cally be­tween the ex­am­ples ap­pealed to here.

In par­tic­u­lar, it doesn’t seem to me that the good perfor­mance of ran­dom­iza­tion in Pluch­ino et al. (2011b) is at all re­lated to their pre­vi­ous work on pro­mo­tions in hi­er­ar­chi­cal or­ga­ni­za­tions, which uses a very differ­ent model. In­deed, I’ve ar­gued that the key ex­pla­na­tion for Pluch­ino and col­leagues’ (2010, 2011a) re­sults on pro­mo­tions sim­ply is re­gres­sion to the mean; by con­trast, I don’t think re­gres­sion to the mean plays a role in Pluch­ino et al. (2011b). Similarly, I sus­pect that Biondo and col­leagues’ (2013a, 2013b) re­sults are ex­plained by the effi­ciency of fi­nan­cial mar­kets, while a similar ex­pla­na­tion isn’t ap­pli­ca­ble to these other cases.

How my nega­tive con­clu­sions follow

I ear­lier said that the liter­a­ture:

  1. Doesn’t by it­self es­tab­lish that a benefi­cial po­ten­tial of lot­ter­ies is com­mon, or that the benefi­cial effect would be large for some par­tic­u­lar EA fund­ing de­ci­sion.

  2. Doesn’t provide a method to de­ter­mine the perfor­mance of lot­ter­ies that would be eas­ily ap­pli­ca­ble to any spe­cific EA fund­ing de­ci­sion.

  3. Over­all sug­gests that the case for lot­ter­ies is strongest in situ­a­tions that are most similar to in­sti­tu­tional sci­ence fund­ing. How­ever, even in such cases it re­mains un­clear whether lot­ter­ies are strictly op­ti­mal.

So far I’ve set aside work that doesn’t ac­tu­ally vin­di­cate lot­ter­ies, or does so only ac­cord­ing to crite­ria other than max­i­miz­ing the ex­pected value of some quan­tity of in­ter­est prior to a de­ci­sion. While some rele­vant claims re­main, they of­ten rely on agent-based mod­els, which I’ve ar­gued aren’t by them­selves strong ev­i­dence about the perfor­mance of lot­ter­ies in any real-world situ­a­tion. In any case, they at most provide rea­sons to think that lot­ter­ies do bet­ter than the spe­cific al­ter­na­tives they are com­pared against, not that lot­ter­ies are op­ti­mal among all de­ci­sion strate­gies. Fi­nally, some pub­li­ca­tions con­tain du­bi­ous claims that might war­rant cau­tion against tak­ing their re­sults at face value.

In sum­mary, I think that for most re­sults I’ve seen it’s un­clear whether they ap­ply in any real-world situ­a­tion, or even how one would go about ver­ify­ing that they do. In par­tic­u­lar, we can­not con­clude any­thing about ac­tual fund­ing de­ci­sions faced by EAs, hence con­clu­sions 1. and 2. fol­low.

Con­clu­sion 3. is based on me over­all be­ing some­what sym­pa­thetic to Avin’s (2015, 2017, 2018) case for fund­ing sci­ence by lot­tery. I’ve ex­plained above why I be­lieve this work suc­ceeds at over­com­ing the prob­lems af­fect­ing agent-based mod­els to a greater ex­tent. How­ever, my im­pres­sion is mostly based on Avin also uti­liz­ing sev­eral other types of ev­i­dence that demon­strate short­com­ings of the cur­rent grant peer re­view sys­tem (see es­pe­cially Avin, 2018, pp. 2-8, sc. 2).

Pos­i­tive con­clu­sions that remain

I be­lieve the liter­a­ture I re­viewed still sup­ports the con­clu­sion that:

  1. It’s con­cep­tu­ally pos­si­ble that de­cid­ing by lot­tery can have a strictly larger ex­pected ex-ante value than de­cid­ing by some non­ran­dom strate­gies, even when the lat­ter aren’t as­sumed to be ob­vi­ously bad or to have high cost.

Consider for example Pluchino and colleagues’ (2010, 2011a) work on promotions in hierarchical organizations, which I harshly criticized above. My criticism can at most show that their results won’t apply to any real-world organization. However, their model still shows that it’s possible – even if perhaps only under unrealistic assumptions – that random promotions can outperform the nonrandom strategy of promoting the best employees. If my analysis is correct, the latter strategy is bad – indeed, the worst possible strategy given their assumptions – but it might not qualify as “obviously bad” (even given their assumptions). This is even more clearly the case for Avin’s (2017) model of science funding, which I’ve argued is also less affected by the other problems I listed; he tests lotteries against strategies that seem reasonable, and that indeed outperform lotteries for some parameter values.

None of these re­sults are due to lot­ter­ies be­ing as­sumed to be less costly. In­deed, none of the mod­els I’ve seen at­tach any cost to any de­ci­sion pro­ce­dure.

I’ve also said that the liter­a­ture:

  1. Pro­vides some qual­i­ta­tive sug­ges­tions for con­di­tions un­der which lot­ter­ies might have this effect.

In brief, it seems that the good perfor­mance of lot­ter­ies in the mod­els I’ve seen is due to one of:

  • The available op­tions be­ing equally good [25] in ex­pec­ta­tion (Pluch­ino et al., 2010, 2011a; Biondo et al., 2013a, 2013b), per­haps be­cause they have been se­lected for this by an effi­cient mar­ket.

  • Lot­ter­ies en­sur­ing more ex­plo­ra­tion than some al­ter­na­tives, thus pro­vid­ing a bet­ter solu­tion to the ex­plo­ra­tion vs. ex­ploita­tion trade-off (Avin, 2017).

  • The re­la­tion be­tween our cur­rent value es­ti­mates and the true value of the even­tual out­come be­ing af­fected by dy­namic effects that are hard to take into ac­count at de­ci­sion time (Avin, 2017).

Ad­di­tional em­piri­cal ev­i­dence – as op­posed to simu­la­tion re­sults – also sug­gests that:

  • Lot­ter­ies can pre­vent bi­ases such as anti-nov­elty bias (Avin, 2018, p. 11, fn. 10), which might nega­tively af­fect some al­ter­na­tives.

  • Lotteries ensure a spread of funding, which can be better than more concentrated funding due to diminishing marginal returns (Fortin and Currie, 2013; Mongeon et al., 2016; Wahls, 2018).

  • Lot­ter­ies can be less costly for both de­ci­sion mak­ers and grantees (Avin, 2018, pp. 6-8, sc. 2.2).

  • Lot­ter­ies may be a vi­able com­pro­mise when stake­hold­ers dis­agree about how a de­ci­sion should be made (Elster, 1989, p. 109).

My over­all take on al­lo­cat­ing EA fund­ing by lottery

Summary. I’m highly uncertain whether one could improve, say, EA Grants or Open Phil’s grantmaking by introducing some random element. However, I’m now reasonably confident that the way to investigate this would be “bottom-up” rather than “top-down” – i.e. examining the specifics of a particular use case rather than asking when lotteries can be optimal under idealized conditions. Specifically, lotteries may be optimal in cases where explicitly accounting for the influence of an individual decision on future decisions is prohibitively costly.

The gen­eral case for why lot­ter­ies might be op­ti­mal in prac­tice. [26] When one de­ci­sion af­fects oth­ers, op­ti­miz­ing each de­ci­sion in iso­la­tion may not lead to the over­all best col­lec­tion of de­ci­sions. It can be best to make an in­di­vi­d­u­ally sub­op­ti­mal de­ci­sion when this en­ables bet­ter de­ci­sions in the fu­ture. Lot­ter­ies turn out to some­times have this effect, for ex­am­ple be­cause they ex­plore more and thus provide more in­for­ma­tion for fu­ture de­ci­sions.

Ideal­ized agents with ac­cess to all rele­vant in­for­ma­tion may be able to ex­plic­itly ac­count for and op­ti­mize effects on other de­ci­sions. They may thus find a fine-tuned non­ran­dom strat­egy that beats a lot­tery. For ex­am­ple, rather than ex­plor­ing ran­domly, they could calcu­late just when and how to ex­plore to max­i­mize the value of in­for­ma­tion.

How­ever, this ex­plicit ac­count­ing may be pro­hibitively costly in prac­tice, and re­quired data may not be available. For ex­am­ple, it’s im­pos­si­ble to fully ac­count for diminish­ing marginal re­turns of fund­ing with­out know­ing the de­ci­sions of other fun­ders. In ad­di­tion, even when in­for­ma­tion is available its use may be af­flicted by com­mon bi­ases. For these rea­sons, it may in prac­tice be in­fea­si­ble to im­prove on a lot­tery.

Gen­eral im­pli­ca­tions for the use of lot­ter­ies. De­ci­sion-mak­ers should con­sider lot­ter­ies when they make a col­lec­tion of re­lated de­ci­sions where the benefi­cial effects of lot­ter­ies are rele­vant. A good strat­egy may then be to:

  1. Assess how an in­di­vi­d­ual de­ci­sion in­fluences fu­ture de­ci­sions.

  2. Es­ti­mate how costly it’d be to ex­plic­itly ac­count for these in­fluences.

  3. Based on this, try to improve on lotteries, and use them if and only if these efforts fail or prove too costly.

This will de­pend on the speci­fics of a use case.

Avin (per­sonal com­mu­ni­ca­tion) has sug­gested that we are less likely to beat a lot­tery if at the time of mak­ing a de­ci­sion:

  • We don’t know the de­ci­sions by other rele­vant ac­tors.

  • We an­ti­ci­pate long feed­back loops.

  • We face dis­agree­ment or deep un­cer­tainty about how to eval­u­ate our op­tions.

Implications for the use of lotteries in EA funding. As I explained, my literature review leaves open the question of whether a lottery would improve any particular EA funding decision. This has to be assessed on a case-by-case basis, but my review doesn’t provide much guidance for this. Instead, I suggest following the strategy outlined above for the general case.

When tempted by a lot­tery, con­sider alternatives

Lot­ter­ies are an eas­ily us­able bench­mark against which other de­ci­sion strate­gies can be tested, ei­ther in a real-world ex­per­i­ment or a simu­la­tion.

What if a lot­tery out­performs a con­tender? The best re­ac­tion may be to un­der­stand why, and to use this in­sight to con­struct an even bet­ter third strat­egy.

In this way, lot­ter­ies could be used as a tool for an­a­lyz­ing and im­prov­ing de­ci­sions, even in cases where they aren’t the best de­ci­sion pro­ce­dure, all things con­sid­ered.
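
A simulated version of such a benchmark could look roughly like the sketch below. Everything in it is an assumption made for illustration: applicants have normally distributed true values, the funder sees those values plus normally distributed noise, and the contender simply picks the highest scores. The names `lottery`, `top_scores` and `benchmark` are hypothetical.

```python
import random


def lottery(scores, k, rng):
    """Baseline: ignore the scores and pick k applicants at random."""
    return rng.sample(range(len(scores)), k)


def top_scores(scores, k, rng):
    """Contender: pick the k applicants with the highest noisy scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def benchmark(strategy, noise_sd, trials=300, applicants=50, k=5, seed=1):
    """Mean true value of the selected applicants, averaged over simulated rounds."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        values = [rng.gauss(0.0, 1.0) for _ in range(applicants)]  # unobserved true values
        scores = [v + rng.gauss(0.0, noise_sd) for v in values]    # what the funder sees
        chosen = strategy(scores, k, rng)
        total += sum(values[i] for i in chosen) / k
    return total / trials


for noise_sd in (0.5, 5.0):
    print(noise_sd, benchmark(top_scores, noise_sd), benchmark(lottery, noise_sd))
```

Any other strategy with the same signature can be passed to `benchmark` in place of `top_scores`. The size of its margin over the lottery, and how that margin changes with the noise level, is the kind of result one would then try to understand.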

As an example, we might realize that one advantage of science funding by lotteries is that it avoids costs for applicants. Sometimes there may be decision mechanisms that have the same advantage without using a lottery. For instance, Open Phil’s ‘Second Chance’ Program considered applications that had already been written for another program, and thus imposed almost zero marginal cost on applicants. [27]

Similarly, there may be other ways to en­sure ex­plo­ra­tion, re­duce cost, or reap the other benefits of lot­ter­ies I listed above.

How­ever, in view of the ob­sta­cles men­tioned ear­lier, it may in prac­tice not be pos­si­ble to im­prove on a lot­tery.

Av­enues for fur­ther research

Based on this project, I see four avenues for further research. While I’m not very excited about any of them, it’s plausible to me that it might be worthwhile for someone to pursue them.

  • Re­search in­spired by Den­rell and Liu (2012, Model 2):

    • Are there EA situ­a­tions – not nec­es­sar­ily re­lated to fund­ing – where our es­ti­mates’ re­li­a­bil­ity varies a lot be­tween op­tions?

    • This could for ex­am­ple be the case when com­par­ing work across cause ar­eas, or syn­the­siz­ing differ­ent types of ev­i­dence.

    • There would then be rea­sons not to choose the op­tion with high­est es­ti­mated value, as demon­strated by Den­rell and Liu.

    • How­ever, this is a rel­a­tively easy-to-find ex­ten­sion of the op­ti­mizer’s curse (Smith and Win­kler, 2006), and also broadly re­lated to sev­eral blog posts by Holden Karnofsky and oth­ers such as this one, all of which have re­ceived sig­nifi­cant at­ten­tion in the EA com­mu­nity. I’d there­fore guess that most eas­ily ap­pli­ca­ble benefits from be­ing aware of such effects have already been reaped.

  • In­ves­ti­gate to what ex­tent Avin’s (2015, 2017, 2018) case for fund­ing sci­ence by lot­tery ap­plies to those EA fund­ing de­ci­sions which are most similar to in­sti­tu­tional sci­ence fund­ing, i.e. per­haps Open Phil’s sci­ence grants.

    • Note that Open Phil’s share of funding in some cause areas may be comparable to the share that large institutional funders hold in science.

    • In prin­ci­ple, the dy­namic effects de­scribed by Avin (2017) also af­fect sci­ence funded by Open Phil.

    • I ex­pect do­ing this well would re­quire ac­cess to data spe­cific to the fund­ing situ­a­tion, which might not be available.

    • I’d guess this would be more likely to iden­tify spe­cific im­prove­ments to cur­rent fund­ing strate­gies than to ac­tu­ally recom­mend a lot­tery.

  • Pick a spe­cific EA fund­ing de­ci­sion (e.g. EA Grants or dona­tions by an in­di­vi­d­ual) for which rele­vant in­for­ma­tion is available and as­sess how hard the ob­sta­cles to ex­plic­itly ac­count­ing for in­di­rect effects are.

  • Adapt Avin’s (2017) epistemic land­scape model to EA fund­ing:

    • Po­ten­tial of nega­tive im­pact.

    • Heavy-tailed im­pacts.

    • New dy­namic effect rep­re­sent­ing cru­cial con­sid­er­a­tions that can dra­mat­i­cally change im­pact es­ti­mates in­clud­ing flip­ping their sign.

    • I started introducing some of these effects and ran some preliminary experiments. The code is available on request, but starting from Avin’s version may be better since I made only a few, poorly commented changes.

    • One way to em­piri­cally an­chor such a model would be to look at how GiveWell’s cost-effec­tive­ness es­ti­mates have changed over time.

    • Among the four research directions mentioned here, I’m least excited about this one. This is because I think such a model would provide limited practical guidance by itself, for reasons similar to the ones discussed above.
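
To illustrate the first avenue above, here is a minimal sketch of why the option with the highest estimate need not be the best choice when reliability differs between options. It is not Denrell and Liu’s model. It uses a standard normal prior with normally distributed estimation noise, and the two options and all their numbers are hypothetical.

```python
def posterior_mean(estimate, noise_sd, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean of the true value under a normal prior and normal noise."""
    weight = prior_sd**2 / (prior_sd**2 + noise_sd**2)  # trust placed in the estimate
    return prior_mean + weight * (estimate - prior_mean)


# Hypothetical options as (raw estimate, noise standard deviation).
options = {"speculative": (3.0, 2.0), "well-studied": (2.0, 0.5)}
adjusted = {name: posterior_mean(est, sd) for name, (est, sd) in options.items()}

best_raw = max(options, key=lambda name: options[name][0])
best_adjusted = max(adjusted, key=adjusted.get)
print(best_raw, best_adjusted, adjusted)
```

With these numbers the noisy estimate of 3.0 shrinks to about 0.6, while the more reliable estimate of 2.0 only shrinks to about 1.6, so the ranking flips once reliability is taken into account.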


I did this work as part of a 6-week Sum­mer Re­search Fel­low­ship at the Cen­tre for Effec­tive Altru­ism. (How­ever, I spent only the equiv­a­lent of ~3 weeks on this pro­ject as I was work­ing on an­other one in par­allel.) I thank Sha­har Avin for a helpful con­ver­sa­tion at an early stage of this pro­ject and feed­back on this post, as well as Sam Clarke, Max Dal­ton, and Jo­hannes Treut­lein for com­ments on notes that served as a start­ing point for writ­ing this post.


Avin, S., 2015. Fund­ing sci­ence by lot­tery. In Re­cent Devel­op­ments in the Philos­o­phy of Science: EPSA13 Helsinki (pp. 111-126). Springer, Cham.

Avin, S., 2017. Cen­tral­ized Fund­ing and Epistemic Ex­plo­ra­tion. The Bri­tish Jour­nal for the Philos­o­phy of Science.

Avin, S., 2018. Policy Con­sid­er­a­tions for Ran­dom Allo­ca­tion of Re­search Funds. In RT. A Jour­nal on Re­search Policy and Eval­u­a­tion, 6(1).

Bar­nett, W.P., 2008. The red queen among or­ga­ni­za­tions: How com­pet­i­tive­ness evolves. Prince­ton Univer­sity Press.

Biondo, A.E., Pluch­ino, A. and Rapisarda, A., 2013a. The benefi­cial role of ran­dom strate­gies in so­cial and fi­nan­cial sys­tems. Jour­nal of Statis­ti­cal Physics, 151(3-4), pp.607-622. [arXiv preprint]

Biondo, A.E., Pluch­ino, A., Rapisarda, A. and Helbing, D., 2013b. Are ran­dom trad­ing strate­gies more suc­cess­ful than tech­ni­cal ones?. PloS one, 8(7), p.e68344.

Boyce, J.R., 1994. Allo­ca­tion of goods by lot­tery. Eco­nomic in­quiry, 32(3), pp.457-476.

Boyle, C., 1998. Or­ga­ni­za­tions se­lect­ing peo­ple: how the pro­cess could be made fairer by the ap­pro­pri­ate use of lot­ter­ies. Jour­nal of the Royal Statis­ti­cal So­ciety: Series D (The Statis­ti­cian), 47(2), pp.291-321.

Den­rell, J. and Liu, C., 2012. Top perform­ers are not the most im­pres­sive when ex­treme perfor­mance in­di­cates un­re­li­a­bil­ity. Pro­ceed­ings of the Na­tional Academy of Sciences, 109(24), pp.9331-9336.

Den­rell, J., Fang, C. and Liu, C., 2014. Per­spec­tive—Chance ex­pla­na­tions in the man­age­ment sci­ences. Or­ga­ni­za­tion Science, 26(3), pp.923-940.

Elster, J., 1989. Solomonic judge­ments: Stud­ies in the limi­ta­tion of ra­tio­nal­ity. Cam­bridge Univer­sity Press.

Fortin, J.M. and Cur­rie, D.J., 2013. Big sci­ence vs. lit­tle sci­ence: how sci­en­tific im­pact scales with fund­ing. PloS one, 8(6), p.e65263.

Frank, R.H., 2016. Suc­cess and luck: Good for­tune and the myth of mer­i­toc­racy. Prince­ton Univer­sity Press.

Grim, P., 2009, Novem­ber. Thresh­old Phenom­ena in Epistemic Net­works. In AAAI Fall Sym­po­sium: Com­plex Adap­tive Sys­tems and the Thresh­old Effect (pp. 53-60).

Har­nagel, A., 2018. A Mid-Level Ap­proach to Model­ing Scien­tific Com­mu­ni­ties. Stud­ies in His­tory and Philos­o­phy of Science (forth­com­ing). [Preprint]

Hofs­tee, W.K., 1990. Allo­ca­tion by lot: a con­cep­tual and em­piri­cal anal­y­sis. In­for­ma­tion (In­ter­na­tional So­cial Science Coun­cil), 29(4), pp.745-763.

Liu, C. and De Rond, M., 2016. Good night, and good luck: per­spec­tives on luck in man­age­ment schol­ar­ship. The Academy of Man­age­ment An­nals, 10(1), pp.409-451.

Martin, T., Hof­man, J.M., Sharma, A., An­der­son, A. and Watts, D.J., 2016, April. Ex­plor­ing limits to pre­dic­tion in com­plex so­cial sys­tems. In Pro­ceed­ings of the 25th In­ter­na­tional Con­fer­ence on World Wide Web (pp. 683-694). In­ter­na­tional World Wide Web Con­fer­ences Steer­ing Com­mit­tee. [arXiv preprint]

Mauboussin, M.J., 2012. The suc­cess equa­tion: Un­tan­gling skill and luck in busi­ness, sports, and in­vest­ing. Har­vard Busi­ness Press.

Mer­ton, R.K., 1968. The Matthew effect in sci­ence: The re­ward and com­mu­ni­ca­tion sys­tems of sci­ence are con­sid­ered. Science, 159(3810), pp.56-63.

Mon­geon, P., Brodeur, C., Beaudry, C. and Lariv­ière, V., 2016. Con­cen­tra­tion of re­search fund­ing leads to de­creas­ing marginal re­turns. Re­search Eval­u­a­tion, 25(4), pp.396-404. [arXiv preprint]

Neurath, O.I., 1913. Die Verirrten des Cartesius und das Auxiliarmotiv. (Zur Psychologie des Entschlusses.) Vortrag. Barth.

Phe­lan, S.E. and Lin, Z., 2001. Pro­mo­tion sys­tems and or­ga­ni­za­tional perfor­mance: A con­tin­gency model. Com­pu­ta­tional & Math­e­mat­i­cal Or­ga­ni­za­tion The­ory, 7(3), pp.207-232.

Pluch­ino, A., Rapisarda, A. and Garo­falo, C., 2010. The Peter prin­ci­ple re­vis­ited: A com­pu­ta­tional study. Phys­ica A: Statis­ti­cal Me­chan­ics and its Ap­pli­ca­tions, 389(3), pp.467-472. [arXiv preprint]

Pluch­ino, A., Rapisarda, A. and Garo­falo, C., 2011a. Effi­cient pro­mo­tion strate­gies in hi­er­ar­chi­cal or­ga­ni­za­tions. Phys­ica A: Statis­ti­cal Me­chan­ics and its Ap­pli­ca­tions, 390(20), pp.3496-3511.

Pluch­ino, A., Garo­falo, C., Rapisarda, A., Spagano, S. and Caserta, M., 2011b. Ac­ci­den­tal poli­ti­ci­ans: How ran­domly se­lected leg­is­la­tors can im­prove par­li­a­ment effi­ciency. Phys­ica A: Statis­ti­cal Me­chan­ics and Its Ap­pli­ca­tions, 390(21-22), pp.3944-3954.

Pluch­ino, A., Biondo, A.E. and Rapisarda, A., 2018. Ta­lent Ver­sus Luck: The Role Of Ran­dom­ness In Suc­cess And Failure. Ad­vances in Com­plex Sys­tems, 21(03n04), p.1850014. [arXiv preprint]

Si­mon­ton, D.K., 2004. Creativity in sci­ence: Chance, logic, ge­nius, and zeit­geist. Cam­bridge Univer­sity Press.

Si­na­tra, R., Wang, D., Deville, P., Song, C. and Barabási, A.L., 2016. Quan­tify­ing the evolu­tion of in­di­vi­d­ual sci­en­tific im­pact. Science, 354(6312), p.aaf5239. [un­gated PDF]

Smith, J.E. and Win­kler, R.L., 2006. The op­ti­mizer’s curse: Skep­ti­cism and post­de­ci­sion sur­prise in de­ci­sion anal­y­sis. Man­age­ment Science, 52(3), pp.311-322.

Sobkow­icz, P., 2010. Dilbert-Peter model of or­ga­ni­za­tion effec­tive­ness: com­puter simu­la­tions. arXiv preprint arXiv:1001.4235.

Thorn­gate, W. and Car­roll, B., 1987. Why the best per­son rarely wins: Some em­bar­rass­ing facts about con­tests. Si­mu­la­tion & Games, 18(3), pp.299-320.

Thorngate, W., 1988. On the evolution of adjudicated contests and the principle of invidious selection. Journal of Behavioral Decision Making, 1(1), pp.5-15.

Wahls, W.P., 2018. The NIH must re­duce dis­par­i­ties in fund­ing to max­i­mize its re­turn on in­vest­ments from tax­pay­ers. eLife, 23(7)

Weis­berg, M. and Mul­doon, R., 2009. Epistemic land­scapes and the di­vi­sion of cog­ni­tive la­bor. Philos­o­phy of sci­ence, 76(2), pp.225-252.


[1] In par­tic­u­lar, I didn’t think about game-the­o­retic rea­sons for mak­ing ran­dom de­ci­sions. For ex­am­ple, it is well known that in some games such as Match­ing pen­nies the only Nash equil­ibria are in mixed strate­gies.

[2] I don’t count strate­gies as ran­dom just be­cause they use noisy es­ti­ma­tors. For ex­am­ple, sup­pose you make a hiring de­ci­sion based on perfor­mance in a work test. You might find that there is some vari­a­tion in the perfor­mance in the test, even when taken by the same per­son. This might ap­pear as ran­dom vari­a­tion, and I might model perfor­mance as a ran­dom vari­able. Cf. Avin (2018, p. 11): “The po­ten­tial er­ror in the test in fact serves as a kind of lot­tery, which op­er­ates on top of the main func­tion of the test, which is to pre­dict perfor­mance.” How­ever, I don’t re­fer to de­ci­sions us­ing noisy es­ti­ma­tors as lot­ter­ies or ran­dom, un­less they in ad­di­tion use de­liber­ate ran­dom­iza­tion.

[3] I didn’t con­sult the pri­mary source.

[4] In the only example I looked at, listed on the website as “EU decides where Agencies located to replace London(181)”, my impression from the news story quotes that Boyle provides is that a lottery was merely used to break a voting tie.

[5] One rea­son why I didn’t do so was merely that I couldn’t eas­ily get ac­cess to Mauboussin (2012) and Si­mon­ton (2004), while I be­came aware of Elster (1989) only shortly be­fore the end of this pro­ject.

[6] Making this statement more precise and defending it is beyond the scope of this post. Denrell and Liu (2012) show that choosing the winning option can have suboptimal ex-ante expected value when the reliability of measurements varies too much; see pp. 1f. of their “Supporting Information” for a highly relevant discussion of a “monotone likelihood ratio property”, and in particular their conclusion that “our result cannot happen when the noise term is normally distributed [...] and our result could happen for some fat-tailed noise distributions”. EDIT: The number of available options is also relevant; see this comment by Flodorner.

[7] Note that Boyce finds that many lot­ter­ies in prac­tice use dis­crim­i­na­tory par­ti­ci­pa­tion fees or other at first glance un­fair mechanisms. He there­fore re­jects fair­ness as an ex­pla­na­tion for the use of lot­ter­ies, and in­stead ap­peals to the self-in­ter­est of the lot­tery’s pri­mary user group.

[8] Cf. also Pluchino et al. (2011b, p. 3953): “On the contrary the process of elections by vote can be subject to manipulation by money and other powerful means.”

[9] “We might think that phys­i­cal abil­ity, which is an eas­ily mea­sured fac­tor, is the only rele­vant crite­rion in the se­lec­tion for mil­i­tary ser­vice and yet use a lot­tery to re­duce the in­cen­tive for self-mu­tila­tion.” (Elster 1989, p. 110). And: “[R]an­dom­iz­ing pre­vents re­cip­i­ents of scarce re­sources from try­ing to make them­selves more el­i­gible, at cost to them­selves or so­ciety.” (Elster, 1989, p. 111)

[10] How­ever, note that the in­cen­tive effects of lot­ter­ies need not be de­sir­able. For ex­am­ple, pro­mot­ing staff by lot­tery may on one hand de­crease waste­ful self-pro­mo­tion, but on the other hand re­move in­cen­tives to do a good job. The first of these effects has been ex­plored by Sobkow­icz (2010, sc. 2.8ff.) in a model oth­er­wise similar to Pluch­ino and col­leagues’ (2010). The sec­ond effect is im­plicit in Pluch­ino and col­leagues’ (2011a, p. 3509) recom­men­da­tion “to dis­t­in­guish pro­mo­tions from re­wards and in­cen­tives for the good work done”.

[11] More precisely, their Figures 10 to 12 use an appropriately normalized version of the percentage mentioned in the main text. See their definition of E_{norm} for details (Pluchino et al., 2018, pp. 1850014-20f.).

[12] While I’m rel­a­tively con­fi­dent in my anal­y­sis of Pluch­ino et al. (2010, 2011a), I haven’t done my own simu­la­tions or proofs to con­clu­sively con­firm it.

[13] The as­sump­tion de­scribed in the main text is referred to as “Peter hy­poth­e­sis” by Pluch­ino et al. (2010, p. 468). They also in­ves­ti­gate an al­ter­na­tive “com­mon sense hy­poth­e­sis”, un­der which ran­dom pro­mo­tions no longer out­perform the strat­egy of pro­mot­ing the best el­i­gible em­ploy­ees (ibid., p. 469, Fig. 2).

[14] As an aside, the work of Pluch­ino et al. (2010, 2011a) on pro­mo­tions in hi­er­ar­chi­cal or­ga­ni­za­tions also gen­er­ated sig­nifi­cant me­dia at­ten­tion. Pluch­ino et al. (2011a) point out that their 2010 pa­per “was quoted by sev­eral blogs and spe­cial­ized news­pa­pers, among which the MIT blog, the New York Times and the Fi­nan­cial Times, and it was also awarded the IG No­bel prize 2010 for ‘Man­age­ment’”. All ar­ti­cles I’ve checked were brief and un­crit­i­cal of Pluch­ino and col­leagues’ re­sults.

[15] For ex­am­ple, Pluch­ino et al. (2011b) con­sider the lo­ca­tion of leg­is­la­tors in a 2-di­men­sional pa­ram­e­ter space from Carlo M. Cipolla’s work on stu­pidity. ‘Epistemic land­scape’ mod­els of sci­en­tific ac­tivity as used by Avin (2015, 2017) and some pre­vi­ous au­thors (e.g. Grim, 2009; Weis­berg and Mul­doon, 2009) have been in­spired by fit­ness mod­els from evolu­tion­ary biol­ogy (Avin, 2015, pp. 78-86, sc. 3.3).

[16] For ex­am­ple, Pluch­ino et al. (2018) re­fer to the heavy-tailed dis­tri­bu­tion of wealth de­spite a less skewed dis­tri­bu­tion of in­puts such as in­tel­li­gence or work hours. Their model re­pro­duces a similar effect. How­ever, they provide no cor­re­spon­dence be­tween any of the vari­ables in their model and a par­tic­u­lar mea­surable real-world quan­tity. They there­fore can­not provide any quan­ti­ta­tive val­i­da­tion, such as for ex­am­ple check­ing whether their model re­pro­duces the cor­rect amount of heavy-tailed­ness rel­a­tive to a given set of in­puts. Similarly, Pluch­ino et al. (2011a, p. 3510) re­fer to a man­age­ment strat­egy em­pha­siz­ing task ro­ta­tion at Brazilian com­pany SEMCO, but provide no de­tails on how that strat­egy is similar to their ran­dom pro­mo­tions. Nor do they ex­plain on what ba­sis they say that task ro­ta­tion at SEMCO was “ap­plied suc­cess­fully”, other than by call­ing SEMCO’s CEO a “guru”, point­ing out he gives lec­tures at the Har­vard Busi­ness School, and say­ing that SEMCO grew from 90 to 3000 em­ploy­ees, a met­ric very differ­ent from the one used in their model.

[17] One limi­ta­tion is that I con­sulted some, but not all, refer­ences in Avin (2017) to check whether epistemic land­scape mod­els have been em­piri­cally val­i­dated in pre­vi­ous pub­li­ca­tions.

[18] I am in­debted to Sha­har Avin for point­ing me to this refer­ence.

[19] Unfortunately, their recommendation for what to do when it isn’t known which of their two assumptions applies is also unconvincing. This is because they don’t consider different degrees of belief in those assumptions, and instead present a recommendation that is only correct if one’s credence is divided about 50-50. What one should actually do, given their results, is to always promote either the best or the worst employees, depending on whether one has less or more than 50% credence in their Peter hypothesis.

[20] Most of their re­ported re­sults are based on the pub­li­ca­tions of 2,887 physi­cists in the Phys­i­cal Re­view jour­nal fam­ily (Si­na­tra et al., 2016, p. aaf5239-7). This sam­ple was se­lected from a much larger dataset of 236,884 physi­cists ac­cord­ing to crite­ria such as min­i­mum ca­reer length and pub­li­ca­tion fre­quency. Their cho­sen sam­ple may raise two wor­ries. First, pat­terns of cita­tions within one fam­ily of physics jour­nals might not gen­er­al­ize to the pat­terns of all cita­tions in physics, let alone to other dis­ci­plines. Se­cond, their re­sults may not be ro­bust to chang­ing the crite­ria for se­lect­ing the sam­ple from the larger data set; e.g., what if we se­lect all physi­cists with ca­reers span­ning at least 10 rather than 20 years? (How­ever, note that most physi­cists in the full data set have short ca­reers with few pub­li­ca­tions, and that there­fore their sam­ple ex­cludes a larger frac­tion of re­searchers than of pub­li­ca­tions.) Si­na­tra et al. ad­dress these and other wor­ries by repli­cat­ing their main re­sults for differ­ent sam­pling crite­ria, and for a differ­ent data set cov­er­ing more sci­en­tific dis­ci­plines; see in par­tic­u­lar their Sup­ple­men­tary Ma­te­ri­als. I haven’t in­ves­ti­gated whether their at­tempts to re­but these wor­ries are con­vinc­ing.

[21] More pre­cisely, they as­sume that sci­en­tist i pub­lishes a to­tal num­ber of N_i pa­pers, with each pa­per’s im­pact be­ing de­ter­mined by in­de­pen­dent draws from Q_i * p, with ran­dom vari­a­tion in p be­ing the same for all sci­en­tists. They then use max­i­mum-like­li­hood es­ti­ma­tion to fit a trivari­ate log-nor­mal dis­tri­bu­tion in N_i, Q_i, p to their data. This re­sults in a model where Q_i is in­de­pen­dent of p (‘luck’) and only weakly cor­re­lated with N_i (‘pro­duc­tivity’). See Si­na­tra et al. (2016, pp. aaf5239-3f., sc. “Q-model”) for de­tails.

[22] How­ever, note that Si­na­tra et al. (2016) always look at the num­ber of cita­tions 10 years af­ter a pub­li­ca­tion. “[E]arly stages of a sci­en­tific ca­reer” here must there­fore mean at least ten years af­ter the first few pub­li­ca­tions.

[23] We must be careful not to prematurely identify the parameter Q with any specific conception of ability. Based on Sinatra and colleagues’ results, Q could be anything that varies between scientists but is constant within each career. It is perhaps more plausible that Q depends on, say, IQ than on height, but their analysis provides no specific reason to think that it does. Cf. also ibid., pp. aaf5239-6f.

[24] There is one re­sult in Si­na­tra et al. (2016) which might at first glance – mis­tak­enly – be taken to ex­plain diminish­ing marginal re­turns of re­search fund­ing, but it is not re­lated to ran­dom­ness. This is their find­ing that a sci­en­tist is more likely to pub­lish their high­est-im­pact pa­per within 20 years af­ter their first pub­li­ca­tion (ibid., p. aaf5239-3, Fig. 2D). One might sus­pect this is be­cause pro­duc­tivity or abil­ity de­crease over time, and worry that fund­ing based on past suc­cesses would there­fore se­lect for sci­en­tists with an already diminished po­ten­tial for im­pact. How­ever, Si­na­tra et al. in fact find that pro­duc­tivity in­creases within a ca­reer, and that abil­ity is con­stant. Their find­ing about the high­est-im­pact work’s timing is merely an effect of few ca­reers be­ing long, with the lo­ca­tion of a cut-off point af­ter 20 years be­ing an arte­fact of their sam­pling crite­ria (see Sup­ple­men­tary Ma­te­ri­als, p. 42, Fig. S12).

[25] The liter­a­ture dis­agrees on whether there ever are situ­a­tions where all available op­tions are ex­actly of equal value. Elster (1989, p. 54) pro­vides the ex­am­ple of “the choice be­tween iden­ti­cal cans of Camp­bell’s tomato soup”. Hofs­tee (1990, p. 746) ob­jects that “[i]n prac­tice, one would take the clos­est one and in­spect its ul­ti­mate con­sump­tion date” and claims that “[f]or prac­ti­cal pur­poses, how­ever, strict equiop­ti­mal­ity is non-ex­is­tent.”

[26] I am in­debted to Avin (per­sonal com­mu­ni­ca­tion) for clearly ex­press­ing these points.

[27] Open Phil’s ‘Second Chance’ Program here just serves as an illustrative example of how a nonrandom strategy can have one of the same advantages as lotteries. I’m otherwise not familiar with the program, and in particular don’t claim that consideration of lotteries informed the design of this program or that avoiding cost for applicants was an important consideration.