Thanks, this and Rob's comment are interesting. But I think these explanations would predict "public, attributed estimates will tend to be lower than estimates from anonymised surveys (e.g., this one) and/or nonpublic estimates". But that's actually not the data we're observing. There were 3 previous anonymised surveys (from the 2008 GCR conference, Grace et al., and that 2020 survey you mention), and each had notably lower mean/median estimates than this survey does, for somewhat similar questions.[1]
Maybe the theory could be "well, that part was just random noise (it's just 4 surveys, so it's not that surprising for this one to just happen to give the highest estimate), and then the rest is because people are inclined against giving high estimates when it'll be public and attributed to them".
But that has a slight epicycle/post-hoc-reasoning flavour. Especially because, similar to the points I raised above:
each survey had a decent number of participants (so you'd think the means/medians would be close to the means/medians the relevant populations as a whole would give)
the surveys were sampling from somewhat similar populations (most clearly for the 2020 survey and this one; less so for the 2008 one, due to a big time gap, and for the Grace et al. one)
So this still seems pretty confusing to me.
I'm inclined to think the best explanation would be that there's something distinctive about this survey that meant either people with high estimates were overrepresented or people were more inclined than they'd usually be to give high estimates.[2] But I'm not sure what that something would be, aside from Rob's suggestions that "respondents who were following the forum discussion might have been anchored in some way by that discussion, or might have had a social desirability effect from knowing that the survey-writer puts high probability on AI risk. It might also have made a difference that I work at MIRI." But I wouldn't have predicted in advance that those things would have as big an effect as seems to be happening here.
I guess it could just be a combination of three sets of small effects (noise, publicity/attribution selecting for lower estimates, and people being influenced by knowing this survey was from Rob).
[1] One notable difference is that the GCR conference attendees were just estimating human extinction by 2100 as a result of "superintelligent AI". Maybe they thought that only accounted for less than 25% of total x-risk from AI (because there could also be later, non-extinction, or non-superintelligence x-risks from AI). But that seems unlikely to me, based on my rough impression of what GCR researchers around 2008 tended to focus on.
[2] I don't think I'm inclined to think this because I'm trying to defend my previous prediction about the survey results, or because I want a more optimistic picture of the future. That's of course possible, but seems unlikely, and there are similarly plausible biases that could push me in the opposite direction (e.g., I've done some AI-related work in the past and will likely do more in future, so higher estimates make my work seem more important).
the surveys were sampling from somewhat similar populations (most clearly for the FHI research scholar's survey and this one; less so for the 2008 one, due to a big time gap, and for the Grace et al. one)
I mostly just consider the FHI research scholar survey to be relevant counterevidence here, because 2008 is indeed really far away and because I think EA researchers reason quite differently from the domain experts in the Grace et al. survey.
When I posted my above comment I realized that I hadn't seen the results of the FHI survey! I'd have to look it up to say more, but one hypothesis I already have could be: the FHI research scholars survey was sent to a broader audience than Rob's current survey (e.g., it was sent to me and some of my former colleagues), and people with lower levels of expertise tend to defer more to what they consider to be the expert consensus, which might itself be affected by the possibility of public-facing biases.
Of course, I'm also just trying to defend my initial intuition here. :)
Edit: Actually I can't find the results of that FHI RS survey. I only find this announcement. I'd be curious if anyone knows more about the results of that survey; when I filled it out I thought it was well designed, and I felt quite curious about people's answers!
I helped run the other survey mentioned, so I'll jump in here with the relevant results and my explanation for the difference. The full results will be coming out this week.
Results
We asked participants to estimate the probability of an existential catastrophe due to AI (see definitions below). We got:
mean: 0.23
median: 0.1
Our question isn't directly comparable with Rob's, because we don't condition on the catastrophe being "as a result of humanity not doing enough technical AI safety research" or "as a result of AI systems not doing/optimizing what the people deploying them wanted/intended". If anything, that means our results should be even higher than Rob's.
Also, we operationalise existential catastrophe/risk differently, though I think the operationalisations are similar enough that they wouldn't affect my estimate. Nonetheless:
it's possible that some respondents mistook "existential catastrophe" for "extinction" in our survey, despite our clarifications (survey respondents often don't read the clarifications!)
while "the overall value of the future will be drastically less than it could have been" and "existential catastrophe" are intended to be basically the same, the former intuitively "sounds" more likely than the latter, which might have affected some responses.
My explanation
I think it's probably a combination of things, including this difference in operationalisation, random noise, and Rob's suggestion that "respondents who were following the forum discussion might have been anchored in some way by that discussion, or might have had a social desirability effect from knowing that the survey-writer puts high probability on AI risk. It might also have made a difference that I work at MIRI."
I can add a bit more detail to how it might have made a difference that Rob works at MIRI:
In Rob's survey, 5/27 of respondents who specified an affiliation said they work at MIRI (~19%)
In our survey, 1/43 respondents who specified an affiliation said they work at MIRI (~2%)
(Rob's survey had 44 respondents in total, ours had 75)
Definitions from our survey
Define an existential catastrophe as the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development (Bostrom, 2013).
Define an existential catastrophe due to AI as an existential catastrophe that could have been avoided had humanity's development, deployment or governance of AI been otherwise. This includes cases where:
AI directly causes the catastrophe.
AI is a significant risk factor in the catastrophe, such that no catastrophe would have occurred without the involvement of AI.
Humanity survives but its suboptimal use of AI means that we fall permanently and drastically short of our full potential.
Other results from our survey
We also asked participants to estimate the probability of an existential catastrophe due to AI under two other conditions.
Within the next 50 years
mean: 0.12
median: 0.05
In a counterfactual world where AI safety and governance receive no further investment or work from people aligned with the ideas of "longtermism", "effective altruism" or "rationality" (but there are no other important changes between this counterfactual world and our world, e.g. changes in our beliefs about the importance and tractability of AI risk issues)
mean: 0.32
median: 0.25
Excited to have the full results of your survey released soon! :) I read a few paragraphs of it when you sent me a copy, though I haven't read the full paper.
Your "probability of an existential catastrophe due to AI" got mean 0.23 and median 0.1. Notably, this includes misuse risk along with accident risk, so it's especially striking that it's lower than my survey's Q2, "[risk from] AI systems not doing/optimizing what the people deploying them wanted/intended", which got mean ~0.401 and median 0.3.
Looking at different subgroups' answers to Q2 (see the computation sketch after this list):
MIRI: mean 0.8, median 0.7.
OpenAI: mean ~0.207, median 0.26. (A group that wasn't in your survey.)
No affiliation specified: mean ~0.446, median 0.35. (Might or might not include MIRI people.)
All respondents other than "MIRI" and "no affiliation specified": mean 0.278, median 0.26.
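For concreteness, here is a minimal sketch of how a per-group summary like the one above can be computed. The affiliations, numbers, and column names are made-up placeholders, not the actual survey responses or analysis code:

```python
import pandas as pd

# Placeholder data only -- these are not the real survey responses.
responses = pd.DataFrame({
    "affiliation": ["MIRI", "MIRI", "OpenAI", "No affiliation specified", "Other org"],
    "q2": [0.8, 0.6, 0.25, 0.4, 0.26],  # hypothetical answers to Q2, as probabilities
})

# Mean and median of the Q2 probabilities for each affiliation group.
print(responses.groupby("affiliation")["q2"].agg(["mean", "median"]))
```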
Even the latter group is surprisingly high. A priori, I'd have expected that MIRI on its own would matter less than "the overall (non-MIRI) target populations are very different for the two surveys":
My survey was sent to FHI, MIRI, DeepMind, CHAI, Open Phil, OpenAI, and "recent OpenAI".
Your survey was sent to four of those groups (FHI, MIRI, CHAI, Open Phil), subtracting OpenAI, "recent OpenAI", and DeepMind. Yours was also sent to CSER, Mila, Partnership on AI, CSET, CLR, FLI, AI Impacts, GCRI, and various independent researchers recommended by these groups. So your survey has fewer AI researchers, more small groups, and more groups that don't have AGI/TAI as their top focus.
You attempted to restrict your survey to people "who have taken time to form their own views about existential risk from AI", whereas I attempted to restrict to anyone "who researches long-term AI topics, or who has done a lot of past work on such topics". So I'd naively expect my population to include more people who (e.g.) work on AI alignment but haven't thought a bunch about risk forecasting; and I'd naively expect your population to include more people who have spent a day carefully crafting an AI x-risk prediction, but primarily work in biosecurity or some other area. That's just a guess on my part, though.
Overall, your methods for choosing who to include seem super reasonable to me (perhaps more natural than mine, even). Part of why I ran my survey was just the suspicion that there's a lot of disagreement between orgs and between different types of AI safety researcher, such that it makes a large difference which groups we include. I'd be interested in an analysis of that question; eyeballing my chart, it looks to me like there is a fair amount of disagreement like that (even if we ignore MIRI).
Oh, your survey also frames the questions very differently, in a way that seems important to me. You give multiple-choice questions like:
Which of these is closest to your estimate of the probability that there will be an existential catastrophe due to AI (at any point in time)?
0.0001%
0.001%
0.01%
0.1%
0.5%
1%
2%
3%
4%
5%
6%
7%
8%
9%
10%
15%
20%
25%
30%
35%
40%
45%
50%
55%
60%
65%
70%
75%
80%
85%
90%
95%
100%
⊠whereas I just asked for a probability.
Overall, you give fourteen options for probabilities below 10%, and two options above 90%. (One of which is the dreaded-by-rationalists "100%".)
By giving many fine gradations of "AI x-risk is low probability" without giving as many gradations of "AI x-risk is high probability", you're communicating that low-probability answers are more normal/natural/expected.
The low probabilities are also listed first, which is a natural choice but could still have a priming effect. (Anchoring to 0.0001% and adjusting from that point, versus anchoring to 95%.) On my screen's resolution, you have to scroll down three pages to even see numbers as high as 65% or 80%. I lean toward thinking "low probabilities listed first" wasn't a big factor, though.
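As a quick sanity check on the option counts above, here is a minimal snippet; the list is simply the scale quoted earlier in this comment, in percent, and the snippet does nothing beyond counting:

```python
# The multiple-choice scale quoted above, in percent.
options = [0.0001, 0.001, 0.01, 0.1, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
           15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(sum(x < 10 for x in options))  # 14 options below 10%
print(sum(x > 90 for x in options))  # 2 options above 90%
```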
My survey's also a lot shorter than yours, so I could imagine it filtering for respondents who are busier, lazier, less interested in the topic, less interested in helping produce good survey data, etc.