Results showed that the FTX collapse decreased satisfaction within the EA community by 0.5-1 points on a 10-point scale, but overall community sentiment remained positive at ~7.5/10.
That’s a big drop! In practice I’ve only ever seen this type of satisfaction scale give results between about 7/10 and 9.5/10 (which makes sense, right? If my satisfaction with EA is 3/10 then I’m probably not sticking around the community and answering member surveys), so that decline is a large chunk of the scale’s de facto range.
I suppose it’s not surprising that the impact on perception is much bigger inside EA, where there’s (appropriately) been tons of discourse on this, than in the general public.
We can get a better sense of the magnitude of the effect here with some further calculations. If we take all the people who have both pre- and post-FTX satisfaction responses (n = 951), we see that 4% of them have a satisfaction score that went up, 53% stayed the same, and 43% went down. That’s quite a striking negative impact. Of those whose scores went down, 67% had a reduction of only 1 point, 22% of 2 points, and 7%, 3%, and 1% had reductions of 3, 4, and 5 points respectively.
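For concreteness, here is a minimal sketch (in Python, with hypothetical column names, not the survey’s actual analysis code) of how these change proportions could be tabulated from paired responses:

```python
# Minimal sketch: tabulate changes in satisfaction for respondents with paired
# pre- and post-FTX ratings. Column names are illustrative, not the survey's.
import pandas as pd

def change_breakdown(df: pd.DataFrame):
    change = df["satisfaction_post"] - df["satisfaction_pre"]
    summary = pd.Series({
        "went up": (change > 0).mean(),
        "stayed the same": (change == 0).mean(),
        "went down": (change < 0).mean(),
    })
    # Among those whose score went down, the share at each size of decrease
    decreases = change[change < 0].value_counts(normalize=True).sort_index()
    return summary, decreases

# Toy data, just to show the output format:
toy = pd.DataFrame({"satisfaction_pre":  [8, 9, 7, 8, 9],
                    "satisfaction_post": [7, 9, 7, 6, 9]})
summary, decreases = change_breakdown(toy)
print(summary)
print(decreases)
```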
We can also try to translate this effect into some more commonly used effect size metrics. Firstly, we can use a nice summary effect size metric for these ratings known as the probability of superiority (PSup), which makes relatively few assumptions about the data (mainly that higher ratings are higher and lower ratings are lower within the same respondent). This metric summarises the difference over time by taking the proportion of cases in which a score was higher pre-FTX (42.7%), assigning a 50% weight to cases in which the score was the same pre and post FTX (0.5 * 53.2% = 26.6%), and adding these quantities together (69.3%). It can be read as an approximation of the proportion of people who, in a forced choice of being more or less satisfied, would report being more satisfied before than after. If everyone was more satisfied before, PSup would be 100%; if everyone was more satisfied after, PSup would be 0%; and if people were just as likely to be more satisfied before as after, PSup would be 50%. In this case, we get a PSup of 69.3%, which corresponds to an effect size in standard deviation units (like Cohen’s d) of approximately 0.7.
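As a rough illustration, a sketch of the PSup calculation described above and of one way to convert it to standard deviation units (assuming normally distributed differences; this conversion is an assumption on my part, not necessarily the survey team’s exact method) might look like this:

```python
# Sketch of the probability of superiority (PSup) calculation, plus an
# approximate conversion to Cohen's d assuming normally distributed
# differences: d = sqrt(2) * Phi^-1(PSup).
import numpy as np
from scipy.stats import norm

def probability_of_superiority(pre, post):
    pre, post = np.asarray(pre), np.asarray(post)
    higher_before = np.mean(pre > post)   # 42.7% in the survey data
    ties = np.mean(pre == post)           # 53.2% in the survey data
    return higher_before + 0.5 * ties     # ties count half -> ~69.3%

def psup_to_d(psup):
    return np.sqrt(2) * norm.ppf(psup)

# Reproducing the headline numbers directly from the reported proportions:
psup = 0.427 + 0.5 * 0.532                # = 0.693
print(f"PSup = {psup:.1%}, approx d = {psup_to_d(psup):.2f}")  # ~0.7
```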
We would encourage people not to just look up whether these are small or large effects in a table that says, e.g. per Wikipedia, that 0.7 falls in the ‘medium’ effect size bin. Think about how you would respond to this kind of question, what a difference of 1 or more points would mean to you, and what the proportions of people giving different responses might substantively mean to them. How best to interpret effect sizes varies greatly with context.
if my satisfaction with EA is 3/10 then I’m probably not sticking around the community and answering member surveys
I think this is a reasonable hypothesis, but there are also effects in the opposite direction (e.g. people who don’t care that much about EA don’t bother to track things very closely). Indeed, in this survey more engaged respondents were more concerned about EA leadership, not less.
Overall, Rethink didn’t find any difference in the change in satisfaction based on engagement level. I’m not sure how the mechanism you propose affects things on net, but I definitely agree that FTX has had a greater impact on perception within EA than outside of it.
Nice point! One should also bear in mind that the 0.5 to 1 point decrease will presumably not be permanent. My guess is that the effect of FTX on satisfaction in 2 years will be negligible, but I do not know.
Great, this is useful data.
(Obviously we can’t put everything on a 0-10 scale, but) I just want to add that a 0.5/10 decrease should be considered a medium-to-large drop.