It’s quite easy to research the cost of creating a rice farm, or a power plant, as well as get a tight bounded probability distribution for the expected price you can sell your rice or electricity at after making the initial investment. These markets are very mature and there’s unlikely to be wild swings or unexpected innovations that significantly change the market.
This doesn’t affect your overall article much, but it’s worth noting that commodity prices can be very volatile. For example, looking up the generic rice contract on Bloomberg, and picking some of the more extreme years but always the same month (to avoid seasonality):
1998 April: 10.2
2002 April: 3.6
2004 April: 11.3
2005 April: 7.2
2008 April: 23.8
2010 April: 12.6
2013 April: 15.8
2015 April: 10.0
You do have the ability to lock in the current implied profitability using futures, but in general commodity markets seem to be more volatile than non-commodity markets.
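As a rough illustration of how swingy those numbers are, here is a minimal sketch in Python (using only the April figures quoted above; since these were deliberately picked as extreme years, this will overstate typical volatility):

```python
import math

# April prices of the generic rice contract, from the Bloomberg figures above.
# Note these are deliberately extreme years, so dispersion is overstated.
prices = {1998: 10.2, 2002: 3.6, 2004: 11.3, 2005: 7.2,
          2008: 23.8, 2010: 12.6, 2013: 15.8, 2015: 10.0}

years = sorted(prices)
for a, b in zip(years, years[1:]):
    print(f"{a} -> {b}: {prices[b] / prices[a] - 1:+.0%}")

# Crude dispersion measure: sample std dev of log changes between observations
logs = [math.log(prices[b] / prices[a]) for a, b in zip(years, years[1:])]
mean = sum(logs) / len(logs)
sd = math.sqrt(sum((x - mean) ** 2 for x in logs) / (len(logs) - 1))
print(f"std dev of log changes between observations: {sd:.2f}")
```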
I think one paper shows that there were almost 40 near misses, and I think that was put up by the Future of Life Institute, so some people can look up that paper, and I think that in general it seems that experts agree some of the biggest risks from nuclear would be accidental use, rather than deliberate and malicious use between countries.
Possibly you are thinking of the Global Catastrophic Risks Institute, and Baum et al.’s A Model for the Probability of Nuclear War?
Thanks for highlighting this, I thought it was interesting. It does seem that, if you thought getting Vox to write about AI was good, it would be good to have an offsetting right-wing spokesman on the issue.
One related point would be that we can try to avoid excessively associating AI risk with left-wing causes; discrimination is the obvious one. The alternative would be to try to come up with right-wing causes to associate it with as well; I have one candidate, but I think this strategy may be a bad idea, so am loath to share it.
This was very interesting. Retrospectives on projects that didn’t work can be extremely helpful to others, but I imagine can also be tough to write, so thanks very much!
It takes a long time to craft a response to posts like these. Even if there are clear problems with the post, given the sensitive topic you have to spend a lot of time on nuance, checking citations, and getting the tone right. That is a very high bar, one that I don’t think is reasonable to expect everyone to pass. In contrast, people who agree seem to get a pass for silently upvoting.
While I appreciate your saying you don’t intend to ban topics, I think there is considerable risk that this sort of policy becomes a form of de facto censorship. In the same way that we should be wary of Isolated Demands for Rigour, so too we should also be wary of Isolated Demands for Sensitivity.
Take for example the first item on your list—let’s call it A).
Whether it is or has been right or necessary that women have less influence over intellectual debate and less economic and political power
I agree that this is not a great topic for an EA discussion. I haven’t seen any arguments about the cost-effectiveness of a cause area that rely on whether A) is true or false. It seems unlikely that specifically feminist or anti-feminist causes would be the best things to work on, even if you were very confident that A) was true, or that it was false. If such a topic were very distracting, I can even see it making sense to essentially ban discussion of it, as LessWrong used to do in practice with regard to Politics.
My concern is that a rule/recommendation against discussing such a topic might in practice be applied very unequally. For example, I think that someone who says
As you know, women have long suffered from discrimination, resulting in a lack of political power, and their contributions being overlooked. This is unjust, and the effects are still felt today.
would not be chastised for doing so, or feel that they had violated the rule/suggestion.
However, my guess is that someone who said
As you know, the degree of discrimination against women has been greatly exaggerated, and in many areas, like conscription or homicide risk, they actually enjoy major advantages over men.
might be criticized for doing so, and might even agree (if only privately) that they had in some sense violated this rule/guideline with regard to topic A).
If this is the case, then this policy is de facto a silencing not of topics, but of opinions, which I think is much harder to justify.
As a list of verboten opinions, it also has the undesirable attribute of being very partisan. Looking down the list, it seems that in almost every case the discouraged/forbidden opinion is, in contemporary US political parlance, the (more) Right Wing opinion, and the assumed ‘default’, ‘acceptable’ one is the (more) Left Wing opinion. In addition, my impression (though I am less sure here) is that it is also biased against opinions disproportionately held by older people.
And yet these are two groups that are dramatically under-represented in the EA movement! (source) Certainly it seems that, on a numerical basis, conservatives are more under-represented than some of the protected groups mentioned in this article. This sort of list seems likely to make older and more conservative people feel less welcome, not more. Various viewpoints they might object to have been enshrined, while other topics, whose discussion conservatives find distasteful but which is nonetheless not uncommon in the EA community, are not similarly discouraged.
For a generally well-received article on how to partially address this, you might enjoy Ozy’s piece here.
Here is a recent study on the topic that I think is very relevant:
Gender, Race, and Entrepreneurship: A Randomized Field Experiment on Venture Capitalists and Angels (Gornall and Strebulaev)
We sent out 80,000 pitch emails introducing promising but fictitious start-ups to 28,000 venture capitalists and business angels. Each email was sent by a fictitious entrepreneur with a randomly selected gender (male or female) and race (Asian or White). Female entrepreneurs received an 8% higher rate of interested replies than male entrepreneurs pitching identical projects. Asian entrepreneurs received a 6% higher rate than White entrepreneurs. Our results are not consistent with discrimination against females or Asians at the initial contact stage of the investment process.
Although the study is about investors rather than EAs, it does seem pretty applicable. The EA community is in many ways similar to the VC community:
Similar geographies: the Bay Area, London, New York etc.
Similar education backgrounds.
Both involve evaluating speculative projects with a lot of uncertainty.
Like the studies discussed above, this one finds that evaluators are biased against white men.
(I have some qualms about this type of study, because they involve wasting people’s time without their consent, but this doesn’t affect the conclusions.)
Great post. I’m sure writing this must have been tough, so thanks very much for sharing this.
Great post; I had been thinking about writing something very similar. In many ways I think you have actually understated the potential of the idea. Additionally I think it addresses some of the concerns Owen raised last time.
The final prize evaluations could be quite costly to produce.
I actually think the final evaluations might be cheaper than the status quo. At the moment OpenPhil (or whoever) has to do two things:
1) Judge how good an outcome is.
2) Judge how likely different outcomes are.
With this plan, 2) has been (partially) outsourced to the market, leaving them with just 1).
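To make the division of labour concrete, here is a toy sketch (all numbers hypothetical): the funder only has to state the prize it would pay for a verified outcome, and the certificate’s market price then embeds the traders’ probability estimate.

```python
# Toy model of splitting "how good" (funder) from "how likely" (market).
# All numbers are hypothetical.

prize_if_success = 1_000_000   # (1) the funder judges how good the outcome is

# (2) the market judges how likely it is: if certificates for the outcome
# trade at this price, risk-neutral traders are implicitly pricing in the
# probability below.
certificate_price = 150_000
implied_probability = certificate_price / prize_if_success
print(f"Market-implied probability of success: {implied_probability:.0%}")

# The funder never has to estimate the probability itself; it just pays the
# full prize if and when the outcome is actually verified.
```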
If Impact Prizes took off, I could imagine some actors being drawn into the ecosystem who are only motivated by making profits.
This is not a bug, this is a feature! There is a very large pool of people willing to predict arbitrary outcomes in return for money, that we have thus far only very indirectly been tapping into. In general bringing in more traders improves the efficiency of a market. Even if you add noisy traders, their presence improves the incentives for ‘smart money’ to participate. I think it’s unlikely we’d reach the scale required for actual hedge funds to get involved, but I do think it’s plausible we could get a lot of hedge fund guys participating in their spare time.
In terms of legal status, one option I’ve been thinking about would be copying PredictIt. If we have to pay taxes every time a certificate is transferred, the transaction costs will be prohibitive. I am quite worried it will be hard to make this work within US law unfortunately, which is not very friendly to this sort of experimentation. At the same time, given the SEC’s attitude towards non-compliant security issuance, I would not want to operate outside it!
Other quick thoughts:
One issue with the idea is that it is hard for OpenPhil to add more promised funding later, because the initial investment will already have been committed at some fixed level. E.g. if OpenPhil initially promises $10m, and then later bumps it to $20m, projects that have already sold their tokens cannot expand to take advantage of this increase, so it is effectively pure windfall with no incentive effect. A possible solution would be cohorts: we promise $10m, paid in 2022, for projects started in 2019, and then later add another $12m, paid in 2023, for 2020 projects.
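A minimal sketch of the cohort idea (amounts and years are just the examples above): each prize pool is keyed to the year a project started, so a later top-up opens a fresh pool with full incentive effect, instead of being a windfall for certificates that were already sold against the old figure.

```python
# Cohort-based prize pools: key the money to the project's start year.
# Amounts and years are the illustrative figures from above.
pools = {
    2019: {"prize": 10_000_000, "paid_in": 2022},
}

# Later, instead of bumping the 2019 pool (a pure windfall for certificates
# already sold against the $10m figure), the funder opens a new cohort:
pools[2020] = {"prize": 12_000_000, "paid_in": 2023}

def pool_for(start_year):
    """Projects compete only for the pool of their own cohort."""
    return pools[start_year]

print(pool_for(2019))  # unchanged: no retroactive windfall
print(pool_for(2020))  # new money, full incentive effect
```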
I think I might have been the second largest purchaser of the certificates. My experience was that we didn’t attract the really high quality projects I’d want, and those we did see had very high reservation prices from the sellers, perhaps due to the endowment effect. I suspect sellers might say that they didn’t see enough buyers. Possibly we just had a chicken-and-egg problem, combined with everyone involved being kind of busy.
Could you go into a bit more detail about the two linguistic styles you described, perhaps using non-AI examples? My interpretation of them is basically agent-focused vs internal-mechanics-focused, but I’m not sure this is exactly what you mean.
If the above is correct, it seems like you’re basically saying that internal-mechanics-focused descriptions work better for currently existing AI systems, which seems true to me for things like self-driving cars. But for something like AlphaZero, or Stockfish, I think an agentic framing is often actually quite useful:
A chess/Go AI is easy to imagine: they are smart and autonomous, and you can trust the bot as you would trust a human player. They can make mistakes but probably have good intent. When they encounter an unfamiliar game situation they can think about the correct way to proceed. They behave in concordance with the goal (winning the game) their creator set them, and they tend to make smart decisions. If anything goes wrong then the bot is at fault.
So I think the reason this type of language doesn’t work well for self-driving cars is that they aren’t sufficiently agent-like. But we know genuinely agentic systems can exist—humans are an example—so it seems plausible to me that agentic language will be the best descriptor for future AIs. Certainly it is currently the best descriptor we have, given that we do not understand the internal mechanics of as-yet-uninvented AIs.
In general it’s probably best not to anonymize applications. Field studies generally show no effect on interview selection, and sometimes even show a negative effect (which has also been seen in the lab). Blinding may work for musicians, randomly generated resumes, and identical expressions of interest, but in reality there seem to be subtle cues of an applicant’s background that evaluators may pick up on, and the risk of anonymization backfiring is higher for recruiting groups which are actively interested in DEI. This may be because they are unable to proactively check their biases when blind, or to proactively accommodate disadvantaged candidates at this recruitment stage, or because their staff is already more diverse and people may favor candidates they identify with demographically.
I think you are mis-describing these studies. Essentially, they found that when reviewers knew the race and sex of the applicants, they were biased in favour of women and non-whites, and against white males.
I admit I only read two of the studies you linked to, but I think these quotes from them are quite clear about the conclusions:
We find that participating firms become less likely to interview and hire minority candidates when receiving anonymous resumes.
The public servants reviewing the job applicants engaged in discrimination that favoured female applicants and disadvantaged male candidates
Affirmative action towards the Indigenous female candidate is the largest, being 22.2% more likely to be short listed on average when identified compared to the de-identified condition. On the other hand, the identified Indigenous male CV is 9.4% more likely to be shortlisted on average compared to when it is de-identified. In absolute terms most minority candidates are on average more likely to be shortlisted when named compared to the de-identified condition, but the difference for the Indigenous female candidate is the only one that is statistically significant at the 95% confidence level.
This is also supported by other papers on the subject. For example, you might enjoy reading Williams and Ceci (2015):
The underrepresentation of women in academic science is typically attributed, both in scientific literature and in the media, to sexist hiring. Here we report five hiring experiments in which faculty evaluated hypothetical female and male applicants, using systematically varied profiles disguising identical scholarship, for assistant professorships in biology, engineering, economics, and psychology. Contrary to prevailing assumptions, men and women faculty members from all four fields preferred female applicants 2:1 over identically qualified males with matching lifestyles (single, married, divorced), with the exception of male economists, who showed no gender preference. Comparing different lifestyles revealed that women preferred divorced mothers to married fathers and that men preferred mothers who took parental leaves to mothers who did not. Our findings, supported by real-world academic hiring data, suggest advantages for women launching academic science careers.
This doesn’t mean that anonymizing applications is a bad idea—it appears to have successfully reduced unfair bias—rather, the bias was in the opposite direction from the one the authors expected to find.
Thanks very much! I think this prize is a great idea. I was definitely motivated to invest more time and effort by the hope of winning the prize (along with the satisfaction of making the front page with a lot of karma).
I have definitely heard people referring to Future Perfect as ‘the EA part of Vox’ or similar.
You might enjoy this post Claire wrote: Ethical Offsetting is Antithetical to EA.
Thanks for writing this, I thought it was quite a good summary. However, I would like to push back on two things.
Effective altruism is egalitarian. Effective altruism values all people equally
I often think of age as being one dimension that egalitarians think should not influence how important someone is. However, despite GiveWell being one of the archetypal EA organisations (along with GWWC/CEA), they do not do this. Rather, they value middle-aged years of life more highly than years of life for babies or the elderly. See for example this page here. Perhaps EA should be egalitarian, but de facto it does not seem to be.
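For a concrete sense of what non-egalitarian age-weighting looks like, here is the age-weighting function from the original Global Burden of Disease DALY framework; I’m not claiming GiveWell uses these exact constants, but it shows the general shape, with a year of life in the mid-20s counting for several times a year in infancy or old age:

```python
import math

def daly_age_weight(age):
    """GBD age-weighting function W(a) = C * a * exp(-beta * a).

    With the conventional constants below, the weight peaks at age
    1/beta = 25 and falls off towards infancy and old age.
    """
    C, beta = 0.1658, 0.04
    return C * age * math.exp(-beta * age)

for age in (1, 10, 25, 45, 70):
    print(f"age {age:2d}: weight {daly_age_weight(age):.2f}")
```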
Effective altruism is secular. It does not recommend charities that most effectively get people into Heaven …
This item seems rather different from the other items on the list. Most of the others seem like rational positions for virtually anyone to hold. However, if you were religious, this tenet seems very irrational—helping people get into Heaven would be the most effective thing you could do! Putting this here seems akin to saying that AMF is an EA value; rather, these are conclusions, not premises.
Additionally, there is some evidence that promoting religion might be beneficial even on strictly material grounds. Have you seen the recent pre-registered RCT on Protestant evangelism?
To test the causal impact of religiosity, we conducted a randomized evaluation of an evangelical Protestant Christian values and theology education program that consisted of 15 weekly half-hour sessions. We analyze outcomes for 6,276 ultra-poor Filipino households six months after the program ended. We find significant increases in religiosity and income, no significant changes in total labor supply, assets, consumption, food security, or life satisfaction, and a significant decrease in perceived relative economic status. Exploratory analysis suggests the program may have improved hygienic practices and increased household discord, and that the income treatment effect may operate through increasing grit.
I don’t have a strong view on whether or not this is actually a good thing to do, let alone the best thing. RCTs provide high-quality causal evidence, but even then most interventions do not work very well, and I’m not an expert on the impact of evangelism. But it seems strange to assume from the very beginning that it is not something EAs would ever be interested in.
Congratulations guys, this is really impressive. Thanks for all the work you put into this.
My general model is that charities get funding in two waves:
1) The December giving season
2) The rest of the year
As such, if I ask groups for their runway at the beginning of 1), and they say they have 12 months, that basically means that even if they failed to raise any money at all in the following 1) and 2) they would still survive until next December, at which point they could be bailed out.
However, I now think this is rather unfair, as in some sense I’m playing donor-of-last-resort with other December donors. So yes, I think 18 months may be a more reasonable threshold.
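To spell out the arithmetic, here is a minimal sketch, assuming wave 1) is the December giving season as above: a group asked in December with 12 months of runway survives exactly until the next giving season, which means it is implicitly counting on a bailout from next year’s December donors; 18 months leaves a genuine cushion.

```python
# Toy runway model, assuming wave 1) is the December giving season.
MONTHS_TO_NEXT_DECEMBER = 12  # asked at the start of one giving season

def buffer_past_next_bailout(runway_months):
    """Months of cushion beyond the next December, with zero fundraising."""
    return runway_months - MONTHS_TO_NEXT_DECEMBER

print(buffer_past_next_bailout(12))  # 0: survives only by counting on a
                                     # bailout from next December's donors
print(buffer_past_next_bailout(18))  # 6: a real cushion, not a last resort
```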
No principled reason, other than that this is not really my field, and I ran out of time, especially for work produced outside donate-able organizations. Sorry!
It’s also worth noting that I believe the new managers do not have access to large pots of discretionary funding (easier to deploy than EA Funds) that they can use to fund opportunities that they find.