Unjournal evaluation of “Towards best practices in AGI safety and governance” (Schuett et al, 2023)
Link post
Do experts agree on what labs should do to make AI safer? Schuett et al’s (2023) survey (“Towards best practices in AGI safety and governance: A survey of expert opinion”) finds a broad consensus.
We organized two evaluations of this paper (link: summary and ratings; click through to the full evaluations). Both evaluators rated the paper highly and found the work valuable and meaningful for policy and for discussions of AI safety. They also highlighted important limitations, including sampling bias, the classification of practices, the interpretation of results, and the gap between abstract agreement and real-world implementation, and they offered suggestions for improvements and future work.
For example:
Sampling/statement selection bias concerns:
Evaluator 1: “… whether the sample of respondents is disproportionately drawn from individuals already aligned with AGI safety priorities”
Evaluator 2: “… where statements selected from labs practices are agreed on by labs themselves”
Abstract agreement vs. real-world implementation:
Evaluator 1: “items ~capture agreement in principle, rather than … [the] grappling with… tradeoffs … real-world governance inevitably entails.”
Evaluator 2: “agreement on practices could indicate a host of different things”; these caveats should be incorporated more fully into the stated results.
The evaluation summary highlights other issues that we (as evaluation managers) believe merit further evaluation. We also point to recent work that is related and complementary.
The Unjournal has published 37 evaluation packages (with 20 more in progress), mainly targeting impactful work in quantitative social science and economics across a range of outcomes and cause areas. See our output at https://unjournal.pubpub.org, and see the list of over 200 papers with potential for impact (which we have evaluated or are currently considering or evaluating) here.