(Disclosure: I was an attendee at EAGx Australia in 2022 and 2023. I believe I am one of the data points in the Treatment group described.)
Thanks again for running this well-designed survey, which I know has taken a great deal of effort. The results do surprise me a little, and I notice that part of my motivation for writing this response is ‘I feel like the conferences are really valuable so I want to add alternate explanations that would support that belief.’ That said, I feel like some of my interpretations of this data might be of interest or add value to the conversation, so here goes.
The main thing that stands out to me in my interpretation of this data is that I think most EAs probably have an ‘EA ceiling’. By that I mean, there’s some maximum amount of engagement that each person is capable of, dependent on their circumstances. I think there may actually be two distinct cohorts of people contributing to ceiling effects in the data.
The first cohort (‘personal ceiling’) are people who are doing everything they can, given their other goals and circumstances. I can’t increase my donations if I’m really struggling financially (and I think it’s important to acknowledge that Australia is in a major housing and cost-of-living crisis right now, which certainly affects my capacity to donate). I can’t attend more events if I’m a single parent, or doing shift work on a rigid schedule, or living in a regional town with no active EA community. These people are at a personal ceiling.
The second cohort (‘logical ceiling’) are people who basically already run their entire lives around EA principles (and I met several at EAGx). They’ve taken the 10% pledge, they work at EA orgs, they are vegan, they attend every EA event they reasonably can, they volunteer, they are active online, etc. It’s hard to imagine how people this committed could meaningfully increase their engagement with EA.
Given that attending EAGx requires a significant personal commitment of time and resources, it seems fairly obvious to me that conference attendees would be self-selected for BOTH ‘has free time and resources to attend the conference’ AND ‘higher EA engagement in comparison to people on the mailing list who didn’t attend the conference’. I think this is confirmed by the data: conference attendees had more EA friends and higher event attendance both before and after the conference. We should also consider that the survey response rate for non-conference-attendees was low, and the people who completed the survey are probably more engaged with EA than the average person on the mailing list. I think it would be really interesting to try to determine what percentage of respondents in each group are at either a personal or logical ceiling, and whether these ‘ceiling participants’ differ from other EAs in terms of the stability of their commitment and level of engagement over time. To resort to metaphor, it takes a lot more energy to keep a pot boiling than simmering, and it seems at least plausible to me that a large part of the value of EAGx is helping a relatively small group of extremely engaged people maintain their motivation, focus and commitment, and build new collaborations.
As an experimentalist (I’m a molecular biologist), the ‘obvious’ hypothesis test is one that was proposed in the OP: randomise would-be EAGx attendees into treatment and control groups, and then only let half of them attend. However, I think that using people at the borderline of being accepted or rejected as the basis for such a randomisation study would risk skewing the data. Specifically, I think it’s likely that everyone at a logical ceiling and most people at a personal ceiling would be an ‘automatic accept’ for EAGx and at no risk of being considered ‘borderline admits’. Therefore, the experimentally optimal way to run this would be to finalise the list of acceptances with 30 more acceptances than there are conference places, and then exclude 30 people totally at random. Unfortunately, there’s significant downside risk to such an approach. It’s likely that conference organisers, volunteers and speakers would be among those excluded, which would be disruptive to the conference and would likely reduce the value that other participants would get. I think it’s also important to consider that missing out on attending EAG or EAGx is a massive bummer; people have written before about feelings of unimportance or inadequacy as a result of conference rejection pushing them away from further participation in EA, and we should take this into account if considering running experiments that would involve arbitrarily declining qualified applications. (Edited to add: the experience of people who applied but weren’t selected for an EAGx, either because of a study or because they didn’t make the cut, is likely very different from the experience of EAs if no EAGx was held. FOMO/resentment for having personally missed out when others went is not similar to ‘oh, I hope there will be a conference next year’ or [crickets].)
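To make that concrete, here’s a minimal sketch of the random-exclusion step (the capacity, names and seed are all hypothetical; in practice you would presumably exempt organisers, volunteers and speakers before the draw, at some cost to randomisation):

```python
import random

# Hypothetical numbers: finalise 30 more acceptances than there are places.
CAPACITY = 500
accepted = [f"applicant_{i}" for i in range(CAPACITY + 30)]

rng = random.Random(2024)  # fixed seed so the draw is auditable afterwards
excluded = set(rng.sample(accepted, k=30))      # randomised control group
attendees = [a for a in accepted if a not in excluded]

assert len(attendees) == CAPACITY
```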
I do also think that the metrics used in this study (but not in ‘typical’ EAGx impact surveys) miss a lot of the dimensions via which EAs have impact, especially those most relevant to people at a logical ceiling who are already working or volunteering within EA orgs for multiple hours a week. Metrics like ‘did you read books and forum posts, did you go to meetups, did you make friends’ are great for measuring engagement with EA ideas and community, but not great for measuring outputs like ‘Alex and Tsai had some great chats and have formed a technical collaboration’ or ‘Kate talked to Jess about her research and is now doing a PhD in her lab’ or ‘Kai inspired Josh to get professional mental health treatment and he’s now able to spend another 10 hours a week on effective work’. On this basis, I completely agree with the original post that we need to combine BOTH self-reports of effectiveness based on subjective measures like meaningful connections or feeling motivated, AND objective measures of behaviour change.

I wonder whether it would be possible to incorporate more metrics that would ‘split the difference’, while still relying on self-reports of past behaviour (which is important for all the reasons discussed in OP). For instance, at a 6-month follow-up we could ask, ‘How many people that you met at EAGx have you interacted with in the past month?’ Or we could ask people, immediately after EAGx, to nominate specific actions they intend to take (with a control group of non-attendees) and then follow up 6 months later to ask which of those actions they have actually taken. This design would scale to people with different levels of both engagement and ceiling-ness: a busy professional might commit to reading Scout Mindset and going to at least one meetup, while a student working to build an EA career might commit to applying for EAG, following up with 2 new connections and writing a forum post.
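As a sketch of how the nominated-actions idea could be analysed (the data and numbers below are entirely invented, and a real analysis would need to account for actions clustering within respondents):

```python
# Toy analysis of the 'nominated actions' design (all numbers invented).
# Each respondent lists intended actions at baseline; at 6 months we ask
# which they actually took, and compare follow-through between groups.
from statsmodels.stats.proportion import proportions_ztest

completed = [84, 51]    # actions completed: attendees, non-attendee controls
nominated = [120, 110]  # actions nominated at baseline

# Naive two-proportion test; a real analysis should cluster by respondent,
# since one person's actions are not independent of each other.
stat, pvalue = proportions_ztest(count=completed, nobs=nominated)
print(f"follow-through {completed[0]/nominated[0]:.0%} vs "
      f"{completed[1]/nominated[1]:.0%}  (z={stat:.2f}, p={pvalue:.3f})")
```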
This is longer than I intended it to be, and I hope it doesn’t come across as critical—I think this is very important work, and that we should always be open to considering that beloved interventions are less effective than we would like for them to be. I hope this is a useful addition to the discussion, at any rate. And thanks James for all the work you put into EAGx Australia!
The second cohort (‘logical ceiling’) are people who basically already run their entire lives around EA principles (and I met several at EAGx). They’ve taken the 10% pledge, they work at EA orgs, they are vegan, they attend every EA event they reasonably can, they volunteer, they are active online, etc. It’s hard to imagine how people this committed could meaningfully increase their engagement with EA.
I think ‘engagement’ can be a misleading way to think about this: you can be fully engaged, but still increase your impact by changing how you spend your efforts.
Thinking back over my personal experience, three years ago I think I would probably be counted in this “fully engaged” cohort: I was donating 50%, writing publicly about EA, co-hosting our local EA group, had volunteered for EA organizations and at EA conferences, and was pretty active on the EA Forum. But since then I’ve switched careers from earning to give to direct work in biosecurity and am now leading a team at the NAO. I think my impact is significantly higher now (ex: I would likely reject an offer to resume earning to give at 5x my previous donation level), but the change here isn’t that I’m putting more of my time into EA-motivated work, but instead that (prompted by discussion with other EAs, and downstream from EA cause prioritization work) my EA-motivated work time is going into doing different things.
Yeah, I think this is an excellent point that you have made more clearly than I did: we are measuring engagement as a proxy for effectiveness. It might be a decent proxy for something like ‘probability of future effectiveness’ when considering young students in particular—if an intervention meaningfully increases the likelihood that some well-meaning undergrads make EA friends and read books and come to events, then I have at least moderate confidence that it also increases impact because some of those people will go on to make more impactful choices through their greater engagement with EA ideas. But I don’t think it’s a good proxy for the amount of impact being made by ‘people who basically run their whole lives around EA ideas already.’ It’s hard to imagine how these people could increase their ENGAGEMENT with EA (they’ve read all the books, they RUN the events, they’re friends with most people in the community, etc etc) but there are many ways they could increase their IMPACT, which may well be facilitated/prompted by EAGx but not captured by the data.
Out of curiosity, would you say that since switching careers, your engagement measured by these kinds of metrics (books read, events attended, number of EA friends, frequency of forum activity, etc) has gone up, gone down, or stayed the same?
would you say that since switching careers, your engagement measured by these kinds of metrics (books read, events attended, number of EA friends, etc) has gone up, gone down, or stayed the same?
I think it’s up, but a lot of that is pretty confounded by other things going on in the community. For example, my five most-upvoted EA Forum posts are since switching careers, but several are about controversial community issues, and a lot of the recency effect goes away when looking at inflation-adjusted voting. I did attend EAG in 2023 for the first time since 2016, though, which was driven by wanting to talk to people about biosecurity.
I wonder if the Australian geographical context increases the proportion of attendees who are at/near their ceiling. Most people in North America and Europe who are at/near ceiling have a much wider range of conferences they can attend without significant travel time/expenses, so they are less likely to attend an EAGx given the diminishing marginal returns (and accumulating costs) of attending a bunch of conferences. On the other hand, someone near/at ceiling in Australia (or another location far from most of the more selective conferences) may choose to attend EAGx Australia in large part because it is much more accessible to them.
This is a really good point actually! I have never attended either an EAG conference, or EAGx on another continent, so I don’t really have a frame of reference for how they generally compare. In Australia, EAGx is THE annual conference, and most of us put decently high priority on showing up if we can.
Thanks for the feedback Laura, I think the point about ceiling effects is really interesting. If we care about increasing mean participation, then ceiling effects shouldn’t change the conclusions (the conference really would be useless for people already at the ceiling), but if (as you suggest) the value mostly comes from a handful of people maintaining/growing their engagement and networks, then our method wouldn’t detect that. Detecting effects like that is hard, and while it’s good practice to be skeptical of unobservable explanations, this one doesn’t seem that implausible.
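To illustrate how an effect concentrated in the most-engaged people can hide in a comparison of means, here is a toy simulation (all numbers invented): the conference does nothing except stop the treated top decile’s engagement from drifting downward, and the overall mean change is barely distinguishable from the control group’s.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # respondents per group (invented)

# Baseline engagement on a capped 0-10 scale (the 'ceiling');
# row 0 is the treated group, row 1 the control group.
base = np.clip(rng.normal(5, 2, size=(2, n)), 0, 10)

# Toy scenario: everyone drifts down slightly, except the treated
# top decile, who hold exactly steady thanks to the conference.
followup = base - 0.5 + rng.normal(0, 1, size=(2, n))
top = base[0] > np.quantile(base[0], 0.9)
followup[0, top] = base[0, top]
followup = np.clip(followup, 0, 10)

change = followup - base
print("mean change, treated vs control:",
      round(change[0].mean(), 2), round(change[1].mean(), 2))
print("change among treated top decile:", round(change[0][top].mean(), 2))
```

With these numbers the group means differ by only ~0.05 points against within-group noise of roughly 1 point per person, so a test on the means would almost never detect anything, even though the top decile’s entire decline was prevented.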
Perhaps we could systematically look at the histories of people who are working in high-impact jobs and joined EA after ~2015, tracing through interviews with them and their friends whether we think they’d have ended up somewhere equally impactful if not for attending EAGs. But that would necessarily involve huge assumptions about how impactful EAGs already are, so it may not add much information.
I agree that randomizing almost-accepted people would be statistically great but not informative about the impacts on non-marginal people, and randomly excluding highly-qualified people would be too costly in my opinion. We specifically reached out to people who were accepted but didn’t attend for various reasons (which should be a good comparison point), but there are nowhere near enough of them at EAGxAus to get statistical results. If this was done for all EAG(x)s for a few years we might actually get a great control group though!
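For a rough sense of scale, a back-of-the-envelope power calculation (the effect size, alpha and power here are illustrative choices, not estimates from the survey) suggests how large a pooled control group would need to be:

```python
from statsmodels.stats.power import TTestIndPower

# Controls needed per group for a two-sample t-test; d=0.3, alpha=0.05
# and 80% power are illustrative guesses, not estimates from the survey.
n_needed = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05,
                                       power=0.8, ratio=1.0)
print(f"~{n_needed:.0f} respondents per group")  # ~176
```

That’s far more accepted-but-didn’t-attend people than a single EAGx produces, but it looks reachable if pooled across all EAG(x)s over a few years.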
We did consider having more questions aimed more directly at the factors that are most indicative of direct impact, but we decided on this compromise for two reasons. First, every extra question reduces the response rate; given the 40% drop-out and a small sample size, I’d be reluctant to add too much. Second, questions that take time and thought to answer are especially likely to lead to drop-outs and inaccurate responses.
That said, leaving a text box for ‘what EA connections and opportunities have you found in the last 6 months?’ could be very powerful, though quantifying the results would require a lot of interpretation.
and randomly excluding highly-qualified people would be too costly in my opinion
I feel like if it would give high-quality answers about how valuable such events are, it would be well worth the cost of random exclusion.
But this one feels more like “one to consider doing when you’re otherwise quite happy with the study design”, or something? And willing to invest more in follow-up or incentives to reduce drop-out rates.
Thanks for the response, I really like hearing about other people’s reasoning re: study design! I agree that randomly excluding highly qualified people would be too costly, and I think your idea of building a control group from accepted-cancelled EAGx attendees across multiple conferences is a great idea. I guess my only issue with it is that these people are likely still experiencing the FOMO (they wanted to go but couldn’t). If we are considering a counterfactual scenario where the resources currently used to organise EAGx conferences are spent on something else, there’s no conference to miss out on, so it removes a layer of experience related to ‘damn, I wish I could have gone to that’.
I’m not familiar enough with survey design to comment on the risk of adding more questions reducing the response rate. If you think it would be a big issue, that’s good enough for me—and also I imagine it would further skew the survey respondents towards more-engaged rather than less-engaged people. I do think that for the purpose of this survey, it would make more sense to prompt the EAGx attendees to answer whether they had followed up on any connections / ideas / opportunities from EAGx in the last 6 months. I’m not sure how to word that so that the same survey/questions could be used for both groups though.