Executive summary: This FAQ defends the ESPAI survey’s methodology as unusually strong for eliciting AI researchers’ views, citing its large sample (n=2,778 in 2023), ~15% response rate, randomized question variants, and bias-mitigation steps. It argues that widely cited results (e.g., a median ≈5% chance of extinction or similarly severe disempowerment) are robust across framings, while candidly noting limitations: framing sensitivity, ambiguity around “extremely bad outcomes,” and residual non-response bias. The FAQ is an evidence-based clarification meant to preempt misinterpretation of the forthcoming 2024 results.
Key points:
Method strength and scale: 2023 ESPAI contacted ~20k authors across six top venues and received 2,778 responses (~15%), comparing favorably to similar surveys (e.g., O’Donovan et al. ~4%), with transparent sampling from publication venues rather than ad-hoc “expert” lists.
Bias mitigation and measurement: Invitations obscured topic, payments and reminders boosted participation, questions were cognitive-tested, non-respondents’ demographics were sampled, and multiple framings were randomized—so item non-response and dropout were minimal (~95% reached the end).
Apparent “skips” explained: Most questions went to randomized subsets (5–50%) to expand coverage; completion among those shown a question averaged 96%, so small denominators reflect design, not disengagement (see the arithmetic sketch after this list).
Extinction-related findings are robust: Nearly all respondents answered at least one of two extinction-adjacent items; across four variants the 2023 medians were ~5% (one at 10%), and the same ~5% median holds even among respondents who report “very little” or “a little” prior thought.
Non-response bias unlikely to drive risk estimates: Vague invitations limit self-selection on x-risk interest; recognition effects appear small; AI-safety-focused researchers are a small fraction of respondents; and demographic response differences are measured and do not explain away the headline medians.
Limitations and interpretation: Expert forecasts vary by framing and are not highly reliable; wording (“extinction” vs. “similarly severe disempowerment”) blends scenarios; residual non-response bias may remain—so treat numbers as coarse signals warranting further analysis, not precise forecasts; prior journal publication (2016 JAIR) indicates publishable rigor.
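To make the denominator point concrete, here is a minimal arithmetic sketch. The 20% assignment share is a made-up illustration (the survey’s actual randomized subsets ranged from 5% to 50%); only the 2,778 total and 96% completion figures come from the summary above.

```python
# Back-of-the-envelope check of the "small denominators by design" point.
total_respondents = 2_778   # 2023 ESPAI responses (figure from the summary)
assignment_share = 0.20     # hypothetical share; actual subsets ranged 5-50%
completion_rate = 0.96      # avg. completion among those shown a question

shown = round(total_respondents * assignment_share)   # ~556 respondents see it
answered = round(shown * completion_rate)             # ~534 of them answer

print(f"{shown} shown, {answered} answered out of {total_respondents}")
```

So a question answered by only a few hundred respondents can still reflect near-complete participation among those who were actually shown it.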
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.