Hey Vasco —
Thanks for your interest and also for raising this with us before you posted so I could post this response quickly!
I think you are asking about the first of these, but I’m going to include a few notes on the second and third too, just in case, as there’s a way of hearing your question as being about them.
(1) What is the internal process by which these rankings are produced and where do you describe it?
(2) What are problems and paths being ranked by? What does the ranking mean?
(3) Where is our reasoning for why we rank each problem or path the way we do?
We’ve written some about these things on our site. We’re on the lookout for ways to improve our processes and how we communicate about them (e.g. I updated our research principles and process page this year and would be happy to add more info if it seemed important. If some of the additional notes below seem like they should be included that’d be helpful to hear.)
Here’s a summary of what we say now with some additional notes:
On (1):
Our “Research principles and process” page is the best place to look for an overview, but it doesn’t describe everything.
I’ll quote a few relevant bits here:
> Though most of our articles have a primary author, they are always reviewed by other members of the team before publication.
> For major research, we send drafts to several external researchers and people with experience in the area for feedback.
> We seek to proactively gather feedback on our most central positions — in particular, our views on the most pressing global problems and the career paths that have the highest potential for impact, via regularly surveying domain experts and generalist advisors who share our values.
> For some important questions, we assign a point person to gather input from inside and outside 80,000 Hours and determine our institutional position. For example, we do this with our list of the world’s most pressing problems, our page on the top most promising career paths, and some controversial topics, like whether to work at an AI lab. Ultimately, there is no formula for how to combine this input, so we make judgement calls [...] Final editorial calls on what goes on the website lie with our website director. [me, Arden]
> Finally, many of our articles are authored by outside experts. We still always review the articles ourselves to try to spot errors and ensure we buy the arguments being made by the author, but we defer to the author on the research (though we may update the article substantively later to keep it current).
Here are some additional details that aren’t on the page:
To reply to your specific question about aggregating people’s personal rankings: no, we don’t do any formal sort of ‘voting’ system like that. The problems and paths rankings are informed by the views of the staff at 80,000 Hours and external advisors via surveys where I elicit people’s personal rankings, and lots of ongoing internal discussion, but I am the “point person” for ultimately deciding how to combine this information into a ranking. In practice, this means my views can be expected to have an outsized influence, but I put a lot of emphasis on takes from others and aim for the lists to be something 80,000 Hours as an organisation can stand behind. Another big factor is what the lists were before, which I tend to view as a prior to update from, and which were informed by the research we did in the past and the views of people like Ben Todd, Howie Lempel, and Rob Wiblin.
Our process has evolved over the years, and, for example, the formal “point person” system described above is recent as of this year (though it was informally something a bit like that before). I expect it’ll continue to change, and hopefully improve, especially as we grow the team (right now we have only 2 research staff).
Sometimes it’s been a while since we’ve looked at a problem or path, and we decide to re-do the article on it. That might trigger a change in ranking if we discover something that changes our minds.
More often we adjust the rankings over time without necessarily first re-doing the articles, often in response to surveys of advisors and team members, feedback we get, or events in the world. This might then trigger looking more into something and adding or re-doing a relevant article.
The rankings are not nearly as formal or quantitative as, e.g. the cost-effectiveness analyses that GiveWell performs of its top charities. Though previous versions of the site have included numerical weightings to something like the problem profiles list, we’ve moved away from that practice. We didn’t think the BOTECs and estimations that generated these kinds of numbers were actually driving our views, and the numbers they produced seemed like they suggested a misleading sense of precision. Ranking problems and career paths is messy and we aren’t able to be precise. We discuss our level of certainty in e.g. the problem profiles FAQ and at the end of the research principles page, and try to reflect it in the language on the problems and career path pages.
As you noted, when we make a big change, like adding a new career path to the priority paths, we try to announce it in some prominent form, though we don’t always end up thinking it’s worth it. E.g. we sent a newsletter in April explaining why we now consider infosec to be a priority path. We made a similar announcement when we added AI hardware expertise to the priority paths. Our process for this isn’t very systematic.
On (2):
For problems: In EA shorthand, the ranking is via the ITN framework. We try to describe that in a more accessible / short way at the top of the page in the passage you quoted.
We also have an FAQ which talks a bit more about it.
For career paths it is slightly more complicated. A factor we weren’t able to fit into the passage you quoted is that we also down-rank paths (or don’t write about them at all) if they are super narrow or most people can’t follow them, e.g. becoming a public intellectual (or, to take an extreme example, becoming president of the US).
On (3):
For the most part, we want the articles themselves to explain our reasoning – in each problem profile or career review, we say why we think it’s as pressing / promising as we think it is.
We also draw on surveys of 80k staff + external advisors to additionally help determine and adjust the ranking over time, as described above. We don’t publish these surveys, but we describe the general type of person we tend to ask for input here.
Best,
Arden
Hi Arden, thanks for engaging like this on the forum!
Re: “the general type of person we tend to ask for input”—how do you treat the tradeoff between your advisors holding the values of longtermist effective altruism, and them being domain experts in the areas you recommend? (Of course, some people are both—but there are many insightful experts outside EA).
This is a good question—we don’t have a formal approach here, and I personally think that in general, it’s quite a hard problem who to ask for advice.
A few things to say:
- The ideal is often to have both.
- The bottleneck on getting more people with domain expertise is more often our network than their values: we often don’t have people with sufficient expertise who we know about, believe are highly credible, and who are willing to give us their time. People who share our values tend to be more excited to work with us.
- It depends a lot on the subject matter we are asking about. E.g. if it’s an article about how to become a great software engineer, we don’t care so much about the person’s values; we care about their software engineering credentials. If it’s e.g. an article about how to balance doing good and doing what you love, we care a lot more about their values.
I like that question, Guy. Note 80,000 Hours lists their external advisors on their website. The list only has 6 people (Dr Greg Lewis, Dr Rohin Shah, Dr Toby Ord, Prof. Hilary Greaves, Peter Hartree and Alex Lawsen), and all are quite connected to effective altruism and longtermism. Arden, are these all the external advisors you were referring to in your comment?
No, we have lots of external advisors that aren’t listed on our site. There are a few reasons we might not list people, including:
- We might not want to commit to asking for someone’s advice for a long time, or to have to remove them at some point.
- The person might be happy to help us and give input but not want to be featured on our site.
- It’s work to add people. We often reach out to someone in our network fairly quickly and informally, and getting a bio and permission to put it on our site would feel like overkill / too much friction when we have only asked them a few questions.
Also, there are too many people we get takes from over the course of e.g. a few years to list in a way that would give context and not require substantial person-hours of upkeep. So instead we just list some representative advisors who give us input on key subject matters we work on and where they have notable expertise.
> or to take an extreme example, becoming president of the US
What is your thinking for not including this? I am asking as there might be people (you know better than me!) that might think it worthwhile to pursue this career even if, for them, it has a 0.01% chance of success. I am also asking because there is existing EA advice about being ambitious, but is there advice that I have not seen about not being too ambitious? I feel like many people might “qualify” for becoming a president even if the chance of “making it” is low, so in one way it is perhaps not that narrow (even if there is only one 1st place). And on the way to this goal, people are likely to be managing large pots of money and/or making impactful policy more likely to happen.
I agree that it might be worthwhile to try to become the president of the US, but that wouldn’t mean it’s best for us to have an article on it, especially a highly ranked one. An article takes up real estate on our site, attention from readers, and time. This specific path is a sub-category of political careers, which we have several articles on. In the end, it is not possible for us to have profiles on every path that is potentially worthwhile for someone. My take is that it’s better for us to prioritise options where the described endpoint is achievable for at least a healthy handful of readers.
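Thanks for the comprehensive reply, Arden!
> Thanks for your interest and also for raising this with us before you posted so I could post this response quickly!
Thanks for sharing the 1st version of your answer too, which prompted me to add a little more detail about what I was asking in the post.
> If some of the additional notes below seem like they should be included that’d be helpful to hear.
I think it would be valuable to include all the additional notes which are not on your website. As a minimum viable product, you may want to link to your comment.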
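> To reply to your specific question about aggregating people’s personal rankings: no, we don’t do any formal sort of ‘voting’ system like that. The problems and paths rankings are informed by the views of the staff at 80,000 Hours and external advisors via surveys where I elicit people’s personal rankings, and lots of ongoing internal discussion, but I am the “point person” for ultimately deciding how to combine this information into a ranking. In practice, this means my views can be expected to have an outsized influence, but I put a lot of emphasis on takes from others and aim for the lists to be something 80,000 Hours as an organisation can stand behind.
Thanks for sharing! The approach you are following seems to be analogous to what happens in broader society, where there is often a single person responsible for informally aggregating various views. Using a formal aggregation method is the norm in forecasting circles. However, there are often many forecasts to be aggregated, so informal aggregation would hardly be feasible in most cases. On the other hand, Samotsvety, “a group of forecasters with a great track record”, also uses formal aggregation methods. I am not aware of research comparing informal to formal aggregation of a few forecasts, so there might not be a strong case either way. In any case, I encourage you to try formal aggregation to see if you arrive at meaningfully different results.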
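As a minimal sketch of what formal aggregation of personal rankings could look like, here is a Borda count in Python. Borda is swapped in purely as a simple, well-known rank-aggregation rule; it is not the method Samotsvety or 80,000 Hours uses, and the function name `borda_aggregate` and the survey responses are hypothetical placeholders.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Aggregate personal rankings (best first) with a Borda count.

    An item in position i of a ranking of n items earns n - 1 - i points.
    Items are returned from highest to lowest total score; ties are
    broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical survey responses; the labels are placeholders, not real data.
responses = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
print(borda_aggregate(responses))
```

Comparing the output of a rule like this with the informally aggregated list would be one cheap way to check whether the two approaches differ meaningfully.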
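> Another big factor is what the lists were before, which I tend to view as a prior to update from, and which were informed by the research we did in the past and the views of people like Ben Todd, Howie Lempel, and Rob Wiblin.
Makes sense.
> The rankings are not nearly as formal or quantitative as, e.g. the cost-effectiveness analyses that GiveWell performs of its top charities. Though previous versions of the site have included numerical weightings to something like the problem profiles list, we’ve moved away from that practice. We didn’t think the BOTECs and estimations that generated these kinds of numbers were actually driving our views, and the numbers they produced seemed like they suggested a misleading sense of precision.
Your previous quantitative framework was equivalent to a weighted-factor model (WFM) with the logarithms of importance, tractability and neglectedness as factors with the same weight, such that the sum respects the logarithm of the cost-effectiveness. Have you considered trying a WFM with the factors that actually drive your views?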
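To make the equivalence concrete, here is a sketch of the arithmetic, assuming the usual reading of the ITN framework in which cost-effectiveness is the product of importance $I$, tractability $T$ and neglectedness $N$. The symbols $S$, $w_i$ and $x_i$ are notation introduced here for illustration only.

```latex
\begin{align*}
S &= \sum_i w_i x_i
  && \text{(generic weighted-factor model)} \\
  &= 1 \cdot \log I + 1 \cdot \log T + 1 \cdot \log N
  && \text{(equal weights, log factors)} \\
  &= \log (I \, T \, N)
   = \log(\text{cost-effectiveness}).
\end{align*}
```

A WFM built on other factors would keep the first line and change the $x_i$ and $w_i$, in which case the score would no longer have to equal the logarithm of cost-effectiveness.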
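> I think it would be valuable to include all the additional notes which are not on your website. As a minimum viable product, you may want to link to your comment.
Thanks for your feedback here!
> Your previous quantitative framework was equivalent to a weighted-factor model (WFM) with the logarithms of importance, tractability and neglectedness as factors with the same weight, such that the sum respects the logarithm of the cost-effectiveness. Have you considered trying a WFM with the factors that actually drive your views?
I feel unsure about whether we should be trying to do another WFM at some point. There are a lot of ways we can improve our advice, and I’m not sure this should be at the top of our list, but perhaps it’s worth revisiting if/when we have more research capacity. I’d also guess it would still have the problem of giving a misleading sense of precision, so it’s not clear how much of an improvement it would be. But it is certainly true that the ITN framework substantially drives our views.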