Hey, thanks for this. I work on CEA’s groups team. When you say “we don’t know much about which work … has the most impact on the outcomes we care about”, I think I would rather say:
a) We have a reasonable, yet incomplete, view on how many people different groups cause to engage in EA, and some measure of the depth of that engagement
b) We are unsure how many of those people would have become engaged in EA anyway
c) We do not have a good mapping from “people engaging with EA” to the things that we actually want in the world
I think we should be sharing more of the data we have on what types of community building have, so far, seemed to generate more engagement. To this end we have a contractor who will be providing a centralized service for some community building tasks, to help spread what is working. I also think groups that seem to be performing well should be running experiments where other groups adopt their model. I have proposed this to several groups, and will continue to do so.
However, trying to predict the mapping from engagement to good things happening in the world is (a) sufficiently difficult that I don’t think anyone can do it reliably and (b) deeply unpleasant to a lot of communities. In trying to measure this we could decrease the amount of good that is happening in the world, and also probably wouldn’t succeed in taking the measurement accurately.
Thanks Rob, this is helpful!

I think we should be sharing more of the data we have on what types of community building have, so far, seemed to generate more engagement. To this end we have a contractor who will be providing a centralized service for some community building tasks, to help spread what is working.
I’d love to see more sharing both of the data and of what types of community building seem most effective. But I guess I’m confused as to how you’re assessing the latter. To what extent does this assessment incorporate control groups, even if imperfect (e.g. by comparing the number of engaged EAs a group generates before and after getting a paid organizer, or by comparing the trajectory of EAs generated by groups with paid organizers to that of groups without them)?
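To make those comparisons concrete, here’s a toy sketch in Python. The group names, years, counts, and column names are all invented, purely for illustration:

```python
import pandas as pd

# Hypothetical data: one row per group per year.
df = pd.DataFrame({
    "group":       ["A", "A", "B", "B", "C", "C"],
    "year":        [2020, 2021, 2020, 2021, 2020, 2021],
    "paid":        [0, 1, 0, 1, 0, 0],        # paid organizer that year?
    "engaged_eas": [3, 7, 2, 5, 4, 4],        # newly engaged EAs generated
})

wide = df.pivot(index="group", columns="year", values="engaged_eas")
wide["change"] = wide[2021] - wide[2020]
got_paid = df.groupby("group")["paid"].max() == 1

# (1) Before/after: change in engagement for groups that gained a paid organizer.
print("Gained a paid organizer:", wide.loc[got_paid, "change"].mean())

# (2) Trajectory comparison: the same change for groups that never had one.
print("Never had one:", wide.loc[~got_paid, "change"].mean())
```

Neither comparison is a true control group, of course, since groups that get paid organizers presumably differ from groups that don’t, but either seems better than no baseline at all.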
trying to predict the mapping from engagement to good things happening in the world is (a) sufficiently difficult that I don’t think anyone can do it reliably and (b) deeply unpleasant to a lot of communities.
Yes, totally agree that trying to map from engagement to final outcomes is overkill. Thanks for clarifying this point. FWIW, the difficulty issue is the key factor for me. I was surprised by your “unpleasant to a lot of communities” comment. By that, are you referring to the dynamic where if you have to place value on outcomes, some people/orgs will be disappointed with the value you place on their work?
I also think groups that seem to be performing well should be running experiments where other groups adopt their model. I have proposed this to several groups, and will continue to do so.
This seems like another area where control groups would be helpful in making the exercise an actual experiment. Seems like a fairly easy place to introduce at least some randomization into, e.g. designate a pool of groups that could potentially benefit from adopting another group’s practices, and randomly select which of those groups actually do so. Presumably there would be some selection bias, since some groups in the “adopt another group’s model” condition may decline to do so, but it would still potentially be a step forward in measuring causality.
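For concreteness, here’s roughly what the assignment step could look like (the group names and the 50/50 split are made up). Analysing results by assigned condition, whether or not a group actually adopts the model (i.e. intention-to-treat), is the standard way to handle the decline-to-adopt problem:

```python
import random

random.seed(0)  # fixed seed so the assignment is reproducible/auditable
pool = [f"Group {c}" for c in "ABCDEFGH"]  # groups that opted into the pool
random.shuffle(pool)
half = len(pool) // 2
adopt, control = pool[:half], pool[half:]

print("Adopt the successful group's model:", adopt)
print("Control (business as usual):", control)
# Later: compare outcomes by assigned arm, keeping any decliners in the
# "adopt" arm, so the comparison stays randomized.
```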
I was surprised by your “unpleasant to a lot of communities” comment. By that, are you referring to the dynamic where if you have to place value on outcomes, some people/orgs will be disappointed with the value you place on their work?
Not really. I was referring more to the fact that any attempt to quantify the likely impact someone will have is (a) inaccurate and (b) likely to create some sort of hierarchy and unhealthy community dynamics.
This seems like another area where control groups would be helpful in making the exercise an actual experiment. Seems like a fairly easy place to introduce at least some randomization into
I agree with this. I like the idea of successful groups joining existing mentorship programs, such that there is a natural control group of “the average of all the other mentors.” (There are many ways this experiment would be imperfect, as I’m sure you can imagine.) I think the main implementation challenge here so far has been getting groups to actually want to do this.

We are very careful to preserve groups’ autonomy, and I think this acts as a check on our behaviour. If groups engage with our programs voluntarily, and we don’t make that engagement a condition of funding, it demonstrates that our programs are at least delivering value in the eyes of the organizers. If we started claiming more authority and designating groups into experiments, we’d lose one of our few feedback measures. On balance I would prefer to keep the feedback mechanism rather than run the experiment. (The previous paragraph contains some simplifications; it would certainly be possible to find examples where we haven’t optimised purely for group autonomy.)
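To illustrate the “average of all the other mentors” comparison, here is a rough sketch with made-up scores; the outcome metric is a placeholder for whatever the mentorship program already tracks:

```python
# Invented mentee-outcome scores per mentor; the metric is a placeholder.
outcomes = {
    "successful_group": [8, 7, 9],
    "mentor_2":         [6, 5, 7],
    "mentor_3":         [7, 6, 6],
}

def mean(xs):
    return sum(xs) / len(xs)

for mentor, scores in outcomes.items():
    # Natural control group: pooled scores of all the other mentors.
    others = [s for m, ss in outcomes.items() if m != mentor for s in ss]
    print(f"{mentor}: {mean(scores):.1f} vs. all other mentors: {mean(others):.1f}")
```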
Thanks for clarifying these points Rob. Agree that group autonomy is an important feedback loop, and that this feedback is more important than the experiment I suggested. But to the extent it’s possible to do experimentation on a voluntary basis, I do think that’d be valuable.
I agree with this statement entirely.
Go team!