Thanks so much for doing this analysis and writing this up! I’m curious whether there is a principled reason for using POC as a category, rather than focusing on specific ethnic groups that are underrepresented in EA, especially given what footnote 4 says about the breakdown of EAG attendees who are POC (24% Asian, 5% Hispanic, 2% Black, 1% multiracial). Some people have been critical of the term “POC” because they think it can gloss over this kind of information.
This is also an issue with “Asian”: it is such a broad category (roughly 3/5 of the world’s population!) that it combines groups with very different experiences. For example, 25% of Burmese Americans are classified by the US Census as living in poverty, compared with 6% of Indian Americans.
The question of what “underrepresented in EA” means is also pretty tricky, especially when you’re looking at conferences held in multiple countries with different demographic groups and histories. This summary seems to handle it by comparing the breakdowns of applicants, attendees, moderators, and speakers, but if people are missing from all of these groups, that doesn’t show up in these stats.
Thanks! I conducted most of the analytics underlying the post, and I sympathize with the issue you point out here. The explanation is kind of boring: the data has limitations that make more granular analyses tricky.
In 2022, the EA Global team collected race/ethnicity data exclusively through free-response fields in the application and feedback forms. For this post, we asked assistants working for the events team to hand-code each unique response into two fields: (i) whether or not someone is POC, and (ii) which US Census race/ethnicity category the response corresponded to. On (ii), I chose Census categories mostly to be consistent with how e.g. the 2020 EA Survey coded race/ethnicity data, and to allow for easier further analysis.
This secondhand categorization is necessarily less accurate than what people would have marked for themselves. In particular, our disaggregated race/ethnicity counts are probably less accurate than the binary “is POC” / “not POC” labeling. For example, if someone reports they are “Thai / Indian”, I don’t have great guesses for whether they would have marked themselves as “Asian” or “Multiracial”, but it seems fairly likely to me that they would fit under the “people of color” umbrella. Incidentally, I suspect this kind of issue is why the EA Survey reports a much larger percentage of multiracial EAs than our attendance numbers do.
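To make the coding scheme concrete, here is a minimal sketch of the two-field labeling described above. The category names, the POC rule, and the example responses are all illustrative assumptions on my part, not the actual rules the events team used:

```python
# Hypothetical sketch of the two-field hand-coding described above.
# Category names and the is_poc rule are assumptions, not the team's actual scheme.
CENSUS_CATEGORIES = {"White", "Black", "Asian", "Hispanic", "Multiracial", "Other"}

def code_response(free_text: str, census_category: str) -> dict:
    """Attach both labels to one unique free-text response."""
    assert census_category in CENSUS_CATEGORIES
    return {
        "raw": free_text,
        # field (ii): the more granular, and more error-prone, category
        "census_category": census_category,
        # field (i): the coarse binary flag (here: everything except "White")
        "is_poc": census_category != "White",
    }

# An ambiguous response like "Thai / Indian" could plausibly be coded either
# "Asian" or "Multiracial", but it is POC under both codings:
print(code_response("Thai / Indian", "Multiracial")["is_poc"])  # True
print(code_response("Thai / Indian", "Asian")["is_poc"])        # True
```

This is why the binary flag is more robust than the granular field: ambiguous responses flip between Census categories but rarely flip the POC bit.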
For speakers, as mentioned in the footnotes, most did not give us race/ethnicity data, so I hand-coded a binary “is POC” flag myself. For a variety of reasons, coding a more granular flag would have taken much more effort, so we skipped that exercise.
As a second general problem, all of the samples we are working with are pretty small. Splitting the race/ethnicity data up more granularly makes each cohort smaller, and doing meaningful statistics on small samples is hard.
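To illustrate the small-sample point with made-up numbers: a subgroup proportion of 2% estimated from a cohort of 100 people comes with a confidence interval several times wider than the estimate itself. A quick sketch using the standard Wilson score interval (the counts here are illustrative, not our actual data):

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - half, center + half)

# Illustrative only: 2 members of a subgroup out of 100 attendees.
lo, hi = wilson_interval(2, 100)
# The interval runs from under 1% to about 7%, so a point estimate of 2%
# is compatible with anything from "half as represented" to "3x as represented".
print(f"{lo:.3f} to {hi:.3f}")
```

With cohorts this small, apparent differences between applicants, attendees, and speakers can easily be noise.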
For the two reasons above, we mostly presented findings at the less granular level here. We might eventually take a look at this question, but I expect it would be a non-trivial lift, so we are currently not prioritizing it over other projects.
As an aside, the events team as a whole is conscious that the term “people of color” hides some important nuance, and doesn’t optimize only for this binary categorization when thinking about diversity. (I no longer work on the EA Global team and am passing this on from speaking with the team.)
Reading this makes me wonder what the right number should be. I think the implication is that even if we got to representative numbers of POC, we might still want to focus on specific subgroups; but it seems like that argument can often be made.
I guess the underlying questions are: what are we aiming for here, and what information would tell us “this is good”?