(Comments re-worded from those on a draft)
Overall I like the direction this post pushes in.
I shared a briefing with the participants summarizing the nine Open Philanthropy grants above, with the idea that it might speed the process along.
In hindsight, this was suboptimal and might have led to some anchoring bias. Some participants complained that the summaries had a subjective component; they said they used the source links but did not pay much attention to those opinions.
On the other hand, other participants said they found the subjective estimates useful. And because the briefing was written in good faith, I am personally not particularly worried about it. Even if there are anchoring issues, we may not necessarily care about them if we think that the output is accurate, in the same way that we may not care about forecasters anchoring on the base rate. [emphasis mine]
If I were redoing this experiment, I would probably limit myself even more to expressing only factual claims and finding sources. A better scheme might have been to share a writeup with a minimal subjective component, and then to strongly encourage participants to make their own judgments before looking at a separate writeup with more subjective summaries, which they could optionally use to adjust their estimates.
I disagree with the opinions expressed in the bolded paragraph. I wouldn’t want forecasters to anchor on a specific base rate I gave them! I’d want them to find their own. Of course you think that the forecasters are anchoring on something accurate since the opinions they’re anchoring on are your own! This isn’t reassuring to me at all.
Thoughts on scaling up this type of estimation [section header]
I’m more excited about in-depth evaluation of agendas/organizations as a whole than about trying to scale up shallow estimations to all grants.
Giving some very quick numbers to this, say:
a 12% chance of AGI being built before 2030,
a 30% chance of it being built in Britain by then if so,
a 90% chance of it being built by DeepMind if so,
an initial 50% chance of it going well if so
GovAI efforts shift the probability of it going well from 50% to 55%.
Punching those numbers into a calculator gives a rough estimate that GovAI reduces existential risk by around 0.081%, or 8.1 basis points.
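For concreteness, the arithmetic can be written out as a product of the numbers above. The 0.081% figure matches the product of all five of them, including the 50% baseline (0.12 × 0.30 × 0.90 × 0.50 × 0.05 = 0.00081); a minimal sketch in Python:

```python
# Back-of-the-envelope calculation for GovAI's effect on existential risk,
# reproducing the ~0.081% (8.1 basis points) figure quoted above.
p_agi_by_2030 = 0.12         # chance of AGI being built before 2030
p_britain = 0.30             # chance it is built in Britain by then, if so
p_deepmind = 0.90            # chance it is built by DeepMind, if so
p_goes_well_baseline = 0.50  # initial chance of it going well
improvement = 0.05           # GovAI shifts p(going well) from 50% to 55%

risk_reduction = (
    p_agi_by_2030 * p_britain * p_deepmind * p_goes_well_baseline * improvement
)
print(f"{risk_reduction * 100:.3f}%")                 # 0.081%
print(f"{risk_reduction * 10_000:.1f} basis points")  # 8.1 basis points
```

Whether the 50% baseline belongs in the product is a modeling choice rather than something spelled out above; leaving it out would give roughly 16 basis points instead.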
This BOTEC feels too optimistic about GovAI’s impact to me, and I trust it even less than most BOTECs because it’s not directly modeling the channel through which I (and I believe GovAI) think GovAI will have the most impact, which is field-building.
Thanks Eli. I think I most disagree with you on the BOTEC point. Copying a paragraph from the text:
The key number here is the 5% improvement (from 50% to 55%). I’m getting this estimate mostly because I think that Allan Dafoe being the “Head of Long-term Strategy and Governance” at DeepMind seems like a promising signal. It nicely corresponds to the “having people in places to implement safety strategies” part of GovAI’s pathway to impact. But that estimation strategy is very crude, and I could imagine a better estimate ranging from <0.5% to more than 5%.
So I think that the handwavy estimate is still meaningful.
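To make explicit how much that key number drives the bottom line, here is a small sketch under the same assumptions as the BOTEC above, varying the improvement across the range mentioned (from below 0.5% up to 5%):

```python
# How the bottom-line estimate scales with the key "improvement" number,
# holding the other BOTEC inputs fixed (same assumptions as the sketch above).
baseline = 0.12 * 0.30 * 0.90 * 0.50  # product of the other four numbers

for improvement in (0.005, 0.01, 0.05):  # 0.5%, 1%, and 5% improvements
    risk_reduction = baseline * improvement
    print(f"improvement {improvement:.1%} -> "
          f"{risk_reduction * 10_000:.2f} basis points")
# improvement 0.5% -> 0.81 basis points
# improvement 1.0% -> 1.62 basis points
# improvement 5.0% -> 8.10 basis points
```

Under these assumptions the estimate scales linearly with that number, so the range spans roughly an order of magnitude, from under one basis point to about eight.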