I am the Principal Research Director at Rethink Priorities. I lead our Surveys and Data Analysis department and our Worldview Investigation Team.
The Worldview Investigation Team previously completed the Moral Weight Project and CURVE Sequence / Cross-Cause Model. We’re currently working on tools to help EAs decide how they should allocate resources within portfolios of different causes, and how to use a moral parliament approach to allocate resources given metanormative uncertainty.
The Surveys and Data Analysis Team primarily works on private commissions for core EA movement and longtermist orgs, where we provide:
Private polling to assess public attitudes
Message testing / framing experiments, testing online ads
Expert surveys
Private data analyses and survey / analysis consultation
Impact assessments of orgs/programs
Formerly, I also managed our Wild Animal Welfare department. I previously worked for Charity Science and was a trustee at Charity Entrepreneurship and EA London.
My academic interests are in moral psychology and methodology at the intersection of psychology and philosophy.
Thanks Ben!
I don’t think there’s a single way to interpret the magnitude of the differences or the absolute scores (e.g. a single effect size), so it’s best to examine this in a number of different ways.
One way to interpret the difference between the ratings is to look at the probability of superiority scores. For example, for Study 3 we showed that ~78% of people would be expected to rate AI safety (6.00) higher than longtermism (4.75). In contrast, for AI safety vs effective giving (5.65), it’s 61%, and for AI safety vs GCRR (5.95) it’s only about 51%.
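As a rough illustration of what a probability of superiority score captures, here is a minimal Python sketch for paired (within-subjects) ratings. It is not the analysis code used for the studies: the function name, the made-up ratings, and the convention of counting ties as half a "win" are all assumptions for the example.

```python
from typing import Sequence


def probability_of_superiority(a: Sequence[float], b: Sequence[float]) -> float:
    """Share of respondents who rate item `a` above item `b`.

    Assumes paired ratings: a[i] and b[i] come from the same respondent.
    Ties count as half a win (an assumed convention, not taken from the studies).
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("a and b must be non-empty and the same length")
    wins = sum(1 for x, y in zip(a, b) if x > y)
    ties = sum(1 for x, y in zip(a, b) if x == y)
    return (wins + 0.5 * ties) / len(a)


# Hypothetical 1-7 liking ratings from eight respondents (illustrative only).
ai_safety = [7, 6, 6, 5, 7, 4, 6, 5]
longtermism = [5, 6, 4, 5, 6, 5, 3, 4]

ps = probability_of_superiority(ai_safety, longtermism)
print(f"Probability of superiority: {ps:.2f}")
```

A value near 0.5 means respondents are about as likely to prefer either item, while values closer to 1 mean most respondents rate the first item higher.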
You can also examine the (raw and weighted) distributions of the responses. This allows one to assess directly how many people “Like a great deal”, “Dislike a great deal” and so on.
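To make the raw vs weighted comparison concrete, here is a small Python sketch that tabulates the share of respondents in each response category, with and without survey weights. The function name, the example responses, the "Neither like nor dislike" label, and the weights are hypothetical and only there for illustration.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence


def response_distribution(
    responses: Sequence[str], weights: Optional[Sequence[float]] = None
) -> Dict[str, float]:
    """Proportion of (optionally weighted) respondents giving each response."""
    if weights is None:
        # Raw distribution: every respondent counts equally.
        weights = [1.0] * len(responses)
    if len(responses) != len(weights):
        raise ValueError("responses and weights must be the same length")
    totals: Dict[str, float] = defaultdict(float)
    for response, weight in zip(responses, weights):
        totals[response] += weight
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total weight must be positive")
    return {label: value / grand_total for label, value in totals.items()}


# Hypothetical responses and survey weights (illustrative only).
responses = [
    "Like a great deal",
    "Dislike a great deal",
    "Like a great deal",
    "Neither like nor dislike",
]
weights = [1.0, 2.0, 1.0, 1.0]

raw = response_distribution(responses)
weighted = response_distribution(responses, weights)
print(raw)
print(weighted)
```

In this toy example the second respondent carries twice the weight of the others, so the weighted share of “Dislike a great deal” is larger than the raw share.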
You can also look at different measures, which have a more concrete interpretation than liking. We did this with one (interest in hearing more information about a topic). But in future studies we’ll include additional concrete measures, so we know e.g. how many people say they would get involved with x movement.
I agree that comparing these responses to other similar things outside of EA (like “positive action” but on the negative side) would be another useful way to compare the meaning of these responses.
One other thing to add is that the design of these studies isn’t optimised for assessing the effect of different names in absolute terms, because every subject evaluated every item (“within-subjects”). This allows greater statistical power more cheaply, but the evaluations are also more likely to be implicitly comparative. To get an estimate of something like the difference in number of people who would be interested in x rather than y (assuming they would only encounter one or the other in the wild at a single time), we’d want to use a between-subjects design where people only evaluate one and indicate their interest in it.