I am a researcher at Rethink Priorities’ Worldview Investigations Team. I also do work for Oxford’s Global Priorities Institute. Previously I was a research analyst at the Forethought Foundation for Global Priorities Research. I took the role after completing the MPhil in Economics at Oxford University. Before that, I studied Mathematics and Philosophy at the University of St Andrews.
Thanks, Vasco. And thanks for helping us think through what we can do better. Some thoughts on this:
We considered several framings, scales and options to give experts. Since they were evaluating a lot of stances and we wanted experts to really know what we meant, we prioritised giving them context and then asking the simplified general question of plausibility, with an intuitive scale. The exact question was: ‘how plausible do you find X stance?’, asked just after fully describing X. We also asked for general notes and comments, and respondents didn’t seem to find that part of the survey particularly confusing (perhaps to your and my surprise). More broadly, I agree with you that precisely defining terms and scales can help some people think things through, but not everyone, and the evidence on how much it helps is mixed.
We didn’t find that people were responding with zero plausibility very often at all. As you can see from the results, almost all respondents found most, if not all, stances at least a little bit plausible. I agree that, had we found a lot of concentration around very high or very low plausibility, some sort of logarithmic scale could have helped distinguish results.
I’m not sure what you have in mind in terms of modelling the stances’ weights as distributions instead of point estimates. Perhaps you mean something like leveraging those distributions via some sort of Monte Carlo procedure, where weights are drawn from the distributions, the pooling is repeated many times, and the results are then aggregated. That indeed sounds more sophisticated and could help track uncertainty, but I suspect it would make very little difference. In particular, I think so because we observed that the unweighted pooling of results across all stances is surprisingly similar to the pooling weighted by experts’ ratings; they look the same if you squint.
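To make the Monte Carlo idea concrete, here is a minimal sketch of what I take the proposal to be. All numbers and distribution choices below are hypothetical, purely for illustration; they are not the survey's actual data or our method. Each stance gets an uncertain weight (here, a truncated normal) rather than a point estimate; each iteration draws weights, pools the stance results with them, and the spread of pooled values across iterations tracks the uncertainty:

```python
# Hypothetical Monte Carlo pooling sketch: draw stance weights from
# distributions instead of using point estimates, pool, and repeat.
# All numbers here are made up for illustration.
import random

random.seed(0)

# Illustrative stance-level results (e.g. some normalised estimate in [0, 1])
# and (mean, sd) parameters for each stance's uncertain weight.
stance_results = {"A": 0.30, "B": 0.55, "C": 0.80}
weight_params = {"A": (1.0, 0.3), "B": (2.0, 0.5), "C": (1.5, 0.4)}

def pooled_once() -> float:
    """Draw one weight per stance (clipped at zero) and return the
    weighted average of the stance results."""
    weights = {s: max(random.gauss(mu, sd), 0.0)
               for s, (mu, sd) in weight_params.items()}
    total = sum(weights.values())
    return sum(weights[s] * stance_results[s] for s in stance_results) / total

# Repeat many times and summarise the distribution of pooled values.
draws = sorted(pooled_once() for _ in range(10_000))
mean = sum(draws) / len(draws)
lo, hi = draws[250], draws[-251]  # rough central 95% interval
print(f"pooled mean ≈ {mean:.3f}, 95% interval ≈ ({lo:.3f}, {hi:.3f})")
```

The mean of the draws lands close to what the point-estimate weighting would give, which is part of why I suspect the extra machinery would change the headline numbers very little; the main thing it adds is the interval around them.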