This is super interesting, thanks for doing this! One question: how did you decide which buckets to put the tags in? I’m asking because some of the choices seem fairly arbitrary, and by drawing different boundaries you might actually get quite different results. For example, I was just checking out your tags script and saw that you have things like nuclear security, nuclear winter, etc. in “Catastrophic risks” rather than in “long_term_risks_and_flourishing”, although I would say they could also fit in the latter category. I think this is especially true for these two categories, as most things in “catastrophic risks” would fit neatly into “long-term risks”, e.g. biosecurity, great power conflict, etc. If they were counted that way, the number of existential-risk-related Forum posts would be much higher than you indicate (although the trends might still be similar, even if the absolute values are different).
I appreciate this might be an annoying nitpick as the categories will always be subjective, but thought this might change the results somewhat.
(P.S. I tried to run an amended version of this myself to check, but had some problems with your code: apparently tags has no attribute tag_types. Agreed with David below though, it would be nice to have a dynamic version so others could more easily re-run your code with slightly varied tagging.)
Great question, I took the categories from here:
https://forum.effectivealtruism.org/tags/all
I have just gone off the assumption that whoever categorised the tags on this page made a good judgement call. I agree completely that longtermist stuff in particular might look like a smaller fraction than it actually is, because it is split across multiple categories. That said, there are posts which fit under multiple longtermist categories, and you’d have to ensure those are not double-counted.
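To make the double-counting point concrete, here is a minimal sketch of merging two categories while counting each post only once. Everything in it is hypothetical and only for illustration: the tag-to-category mapping, the merged "longtermist" bucket, the post IDs and the function name are made up, and this is not how the actual analysis script is structured.

```python
from collections import defaultdict

# Hypothetical tag -> category mapping (illustrative only, not the real tags script).
TAG_TO_CATEGORY = {
    "nuclear security": "catastrophic_risks",
    "nuclear winter": "catastrophic_risks",
    "biosecurity": "catastrophic_risks",
    "existential risk": "long_term_risks_and_flourishing",
}

# Hypothetical re-bucketing: fold both categories into one merged bucket.
MERGED = {
    "catastrophic_risks": "longtermist",
    "long_term_risks_and_flourishing": "longtermist",
}


def count_posts_per_category(posts, tag_to_category, merged=None):
    """Count distinct posts per category; a post counts once per category."""
    merged = merged or {}
    posts_in_category = defaultdict(set)
    for post_id, post_tags in posts.items():
        for tag in post_tags:
            category = tag_to_category.get(tag)
            if category is None:
                continue  # tag is not in any bucket we track
            category = merged.get(category, category)
            # A set de-duplicates posts that carry several tags of one bucket.
            posts_in_category[category].add(post_id)
    return {cat: len(ids) for cat, ids in posts_in_category.items()}


# Made-up posts: post-1 carries a tag from each of the two categories.
posts = {
    "post-1": ["nuclear winter", "existential risk"],
    "post-2": ["biosecurity"],
    "post-3": ["existential risk"],
}

separate_counts = count_posts_per_category(posts, TAG_TO_CATEGORY)
merged_counts = count_posts_per_category(posts, TAG_TO_CATEGORY, MERGED)
print(separate_counts)
print(merged_counts)
```

With these made-up posts, each category has two posts on its own, but the merged bucket has three rather than four, because post-1 sits in both.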
Thanks for the feedback, I will put the code into a notebook when I have time tomorrow. It should not take long.
The trouble you are having with tags.tag_types is likely a Python namespace issue.
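One way to check a namespace problem like this, assuming the cause is that Python is picking up a different module than the one you expect (that is a guess, not something confirmed in this thread), is to look at where the module is actually loaded from. The helper name describe_module below is made up for illustration; it uses only the standard importlib module.

```python
import importlib
import importlib.util


def describe_module(module_name, attribute):
    """Report where a module would be loaded from and whether it has an attribute."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing importable under that name on the current sys.path.
        return {"found": False, "origin": None, "has_attribute": False}
    module = importlib.import_module(module_name)
    return {
        "found": True,
        "origin": spec.origin,  # file path of the module that was picked up
        "has_attribute": hasattr(module, attribute),
    }


# Demonstrated on a standard-library module so the snippet runs anywhere.
# For the case in this thread you would call describe_module("tags", "tag_types").
print(describe_module("json", "dumps"))
```

If the reported origin is not the tags file from the repository, then another module with the same name is being imported instead, which would explain the missing attribute.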
Anyway, I put all of the code into a notebook to make it easier to reproduce. I hope this is close to what you had in mind. I haven’t used notebooks much myself.
https://github.com/MperorM/ea-forum-analysis/blob/main/plots-notebook.ipynb