Hey, Arden from 80,000 Hours here –
I haven’t read the full report, but given the time sensitivity with commenting on forum posts, I wanted to quickly provide some information relevant to some of the 80k mentions in the qualitative comments, which were flagged to me.
Regarding whether we have public measures of our impact & what they show
It is indeed hard to measure how much our programmes counterfactually help move talent to high impact causes in a way that increases global welfare, but we do try to do this.
From the 2022 report the relevant section is here. Copying it in, as there are a bunch of links:
We primarily use six sources of data to assess our impact:
Our own data about how users interact with our services (e.g. our historical metrics linked in the appendix).
Our and others’ impressions of the quality of our visible output.
Overall, we’d guess that 80,000 Hours continued to see diminishing returns to its impact per staff member per year. [But we continue to think it’s still cost-effective, even as it grows.]
Some elaboration:
DIPY estimates are our measure of counterfactual career plan shifts we think will be positive for the world (see the illustrative sketch below). Unfortunately it’s hard to get an accurate read on counterfactuals and response rates, so these are only very rough estimates & we don’t put that much weight on them.
We report on things like engagement time & job board clicks as *lead metrics* because we think they tend to flow through to counterfactual high impact plan changes, & we’re able to measure them much more readily.
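(To make the shape of that DIPY calculation concrete, here is a minimal, hypothetical sketch of how a discounted, impact-adjusted estimate could be assembled from reported plan changes. The class, field names, and numbers are illustrative assumptions rather than 80,000 Hours’ actual methodology; the point is only that the counterfactual probability and the impact adjustment enter multiplicatively, so uncertainty in either flows straight into the total.)

```python
from dataclasses import dataclass

@dataclass
class PlanChange:
    """One self-reported career plan change (all fields hypothetical)."""
    peak_years: float            # expected years of productive work enabled by the switch
    counterfactual_prob: float   # reported chance (0-1) that 80k caused the switch
    impact_adjustment: float     # assumed value relative to a 'typical' plan change
    discount_rate: float = 0.05  # annual discount applied to future years

def dipy(change: PlanChange) -> float:
    """Toy discounted, impact-adjusted peak-years figure for one plan change."""
    discounted_years = sum(
        (1 - change.discount_rate) ** t for t in range(int(change.peak_years))
    )
    return discounted_years * change.counterfactual_prob * change.impact_adjustment

# Invented example inputs - not real survey data.
changes = [
    PlanChange(peak_years=10, counterfactual_prob=0.4, impact_adjustment=1.0),
    PlanChange(peak_years=20, counterfactual_prob=0.3, impact_adjustment=3.0),
]
print(f"Total DIPY (toy example): {sum(dipy(c) for c in changes):.1f}")
```

(In a toy setup like this, halving the assumed counterfactual probabilities halves the total, which is why the counterfactual and response-rate questions above carry so much weight.)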
Headlines from some of the links above:
From our own survey (2138 respondents):
On the overall social impact that 80,000 Hours had on their career or career plans:
1021 (50%) said 80,000 Hours increased their impact
Within this we identified 266 who reported >30% chance of 80,000 Hours causing them to take a new job or graduate course (a “criteria-based plan change”)
26 (1%) said 80,000 Hours reduced their impact.
Themes in answers were demoralisation and causing career choices that were a poor fit
Open Philanthropy’s EA/LT survey was aimed at asking their respondents “What was important in your journey towards longtermist priority work?” – it has a lot of different results and feels hard to summarise, but it showed a big chunk of people considered 80k a factor in ending up working where they are.
The 2020 EA survey link says “More than half (50.7%) of respondents cited 80,000 Hours as important for them getting involved in EA”. (2022 says something similar)
Regarding the extent to which we are cause neutral & whether we’ve been misleading about this
We do strive to be cause neutral, in the sense that we try to prioritize working on the issues where we think we can have the highest marginal impact (rather than committing to a particular cause for other reasons).
For the past several years we’ve thought that the most pressing problem is AI safety, so we have put much of our effort there. (Some 80k programmes focus on it more than others – I reckon for some it’s a majority – but it hasn’t been true that as an org we “almost exclusively focus on AI risk.” A bit more on that here.)
In other words, we’re cause neutral, but not cause *agnostic* - we have a view about what’s most pressing. (Of course we could be wrong or thinking about this badly, but I take that to be a different concern.)
The most prominent place we describe our problem prioritization is our problem profiles page – which is one of our most popular pages. We describe our list of issues this way: “These areas are ranked roughly by our guess at the expected impact of an additional person working on them, assuming your ability to contribute to solving each is similar (though there’s a lot of variation in the impact of work within each issue as well).” (Here’s also a past comment from me on a related issue.)
Regarding the concern about us harming talented EAs by causing them to choose bad early career jobs
To the extent that this has happened, it is quite serious – helping talented people have higher-impact careers is our entire point! I think we will always sometimes fail to give good advice (given the diversity & complexity of people’s situations & the world), but we do try to aggressively minimise negative impacts, and if people think any particular part of our advice is unhelpful, we’d like them to contact us about it! (I’m arden@80000hours.org & I can pass them on to the relevant people.)
We do also try to find evidence of negative impact, e.g. using our user survey, and it seems dramatically less common than the positive impact (see the stats above), though there are of course selection effects with that kind of method so one can’t take that at face value!
Regarding our advice on working at AI companies and whether this increases AI risk
This is a good worry and we talk a lot about this internally! We wrote about this here.
I would also add these results, which I think are, if anything, even more relevant to assessing impact:
80,000 Hours is the second most commonly cited factor for “having the largest impact on one’s personal ability to have a positive impact” (after “Personal contact with EAs”, so it’s the largest substantive program, org or service), being cited by 31.4% of EAs.
The 80K website alone is comparable to EA Global or the EA Forum in cited impact.
It’s also the most commonly cited factor, by a dramatic margin, for causing EAs to learn something important in the last year, being cited by 40% of EAs.
It’s also the second most important factor for people hearing about EA in the first place (13.5% of EAs).
Hi Arden,
Thanks for engaging.
(1) Impact measures: I’m very appreciative of the amount of thought that went into developing the DIPY measure. The main concern (from the outside) with respect to DIPY is that it is critically dependent on the impact-adjustment variable – it’s probably the single biggest driver of uncertainty, since causes can vary by many orders of magnitude. Depending on whether you think the work is impactful (or whether you’re sceptical, e.g. because you’re an AGI sceptic, or because you’re convinced of the importance of preventing AGI risk but worried about counterproductivity from getting people into AI, etc.), the estimate will fluctuate very heavily, and could be zero or significantly negative. From the perspective of an external funder, it’s hard to be convinced of robust cost-effectiveness (or, speaking for myself as a researcher, it’s hard to validate).
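(As a rough illustration of that sensitivity – with every number invented for the purpose – the sketch below holds the measured inputs fixed and varies only the cause-level impact adjustment. The headline figure moves across orders of magnitude and can flip sign on that one parameter alone.)

```python
# Toy sensitivity check: hold the measured inputs fixed and vary only the
# cause-level impact adjustment. All numbers are invented for illustration.
criteria_based_plan_changes = 266   # e.g. the survey figure quoted above
avg_counterfactual_prob = 0.3       # assumed average counterfactual probability

# Different evaluators' views of the value of a marginal worker in the cause,
# relative to some baseline career path. A sceptic might put this near zero;
# someone worried about counterproductive effects might make it negative.
impact_adjustments = {
    "strong believer": 10.0,
    "moderate": 1.0,
    "sceptic": 0.01,
    "worried about net harm": -1.0,
}

for view, adjustment in impact_adjustments.items():
    total = criteria_based_plan_changes * avg_counterfactual_prob * adjustment
    print(f"{view:>22}: impact-adjusted plan changes = {total:8.1f}")
```

(Dividing a fixed budget by these totals would give cost-effectiveness estimates separated by orders of magnitude, with the last one negative – which is roughly what makes the figure hard to validate from the outside.)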
(2) I think we would both agree that AGI (and to a lesser extent, GCR more broadly) is 80,000 Hours’ primary focus.
I suppose the disagreement then is the extent to which neartermist work gets any focus at all. This is to some extent subjective, and also depends on hard-to-observe decision-making and resource allocation done internally. With (a) the team not currently planning to focus on neartermist content for the website (the most visible thing), (b) the career advisory/1-1 work being very AGI-focused too (to my understanding), and (c) fundamentally, OP being 80,000 Hours’ main funder, and all of OP’s 80k grants coming from the GCR capacity-building team over the past 2-3 years, I think from an outside perspective a reasonable assumption is that AGI/GCR accounts for >=75% of marginal resources committed. I exclude the job board from this analysis because I understand it absorbs comparatively little internal FTE right now.
The other issue we seem to disagree on is whether 80k has made its prioritization sufficiently obvious. I appreciate that this is somewhat subjective, but it might be worth erring on the side of being too obvious here – I think the relevant metric would be “Does an average EA who looks at the job board or signs up for career consulting understand that 80,000 Hours would prefer they prioritize AGI?”, and I’m not sure that’s the case right now.
(3) Bad career jobs – this was a concern that was aired, but we didn’t have much time to investigate it; we just flag it as a potential risk for people to consider.
(4) Similarly, we deprioritized the issue of whether getting people into AI companies worsens AI risk. We leave it to potential donors as something they may have to weigh, considering the pros and cons (e.g. per Ben’s article) and making their decisions accordingly.
Hey Joel, I’m wondering if you have recommendations on (1) or on the transparency/clarity element of (2)?
(Context being that I think 80k do a good job on these things, and I expect I’m doing a less good job on the equivalents in my own talent search org. Having a sense of what an ‘even better’ version might look like could help shift my sort of internal/personal Overton window of possibilities.)
Hi Jamie,
For (1), I agree with 80k’s approach in theory – it’s just that cost-effectiveness is likely heavily driven by the cause-level impact adjustment, so you’ll want to model that in a lot of detail.
For (2), I think just declaring up front what you think the most impactful cause(s) are and what you’re focusing on is pretty valuable? And I suppose when people do apply/email, it’s worth making that sort of caveat as well. For our own GHD grantmaking, we do try to declare on our front page that our current focus is NCD policy, and if someone approaches us raising the issue of grants, we make clear what our current grant cycle is focused on.
Hope my two cents is somewhat useful!
Makes sense on (1). I agree that this kind of methodology is not very externally legible and depends heavily on cause prioritisation, sub-cause prioritisation, your view on the most impactful interventions, etc. I think it’s worth tracking for internal decision-making even if external stakeholders might not agree with all the ratings and decisions. (The system I came up with for Animal Advocacy Careers’ impact evaluation suffered similar issues within animal advocacy.)
For (2), I’m not sure why you don’t think 80k do this. E.g. the page on “What are the most pressing world problems?” has the following opening paragraph:
Then the actual ranking is very clear: AI 1, pandemics 2, nuclear war 3, etc.
And the advising page says quite prominently “We’re most helpful for people who… Are interested in the problems we think are most pressing, which you can read about in our problem profiles.” The FAQ on “What are you looking for in the application?” mentions that one criterion is “Are interested in working on our pressing problems”.
Of course it would be possible to make it more prominent, but it seems like they’ve put these things pretty clearly up front.
It seems pretty reasonable to me that 80k would want to talk to people who seem promising but don’t share all the same cause prio views as them; supporting people to think through cause prio seems like a big way they can add value. So I wouldn’t expect them to try to actively deter people who sign up and seem worth advising but, despite the clear labelling on the advising page, don’t already share the same cause prio rankings as 80k. You also suggest “when people do apply/email, it’s worth making that sort of caveat as well”, and that seems in the active deterrence ballpark to me; to the effect of ‘hey are you sure you want this call?’
On (2). If you go to 80k’s front page (https://80000hours.org/), there is no mention that the organization’s focus is AGI or that they believe it to be the most important cause. For the other high-level pages accessible from the navigation bar, things are similarly not obvious. For example, in “Start Here”, you have to read 22 paragraphs down to understand 80k’s explicit prioritization of x-risk over other causes. In the “Career Guide”, it’s about halfway down the page. On the 1-1 advising tab, you have to go down to the FAQs at the bottom of the page, and even then it only refers to “pressing problems” and links back to the research page. And on the research page itself, the issue is that it doesn’t give a sense that the organization strongly recommends AI over the rest, or that x-risk gets the lion’s share of organizational resources.
I’m not trying to be nitpicky, but trying to convey that a lot of less engaged EAs (or people who are just considering impactful careers) are coming in, reading the website, and maybe browsing the job board or thinking of applying for advising – without realizing just how convinced on AGI 80k is (and correspondingly, not realizing how strongly they will be sold on AGI in advisory calls). And this may not just be less engaged EAs, depending on how you define engaged – I was reading Singer two decades ago, have been a GWWC pledger since 2014, and whenever giving to GiveWell have actually taken the time to examine their CEAs and research reports. And yet until I actually moved into direct EA work via the CE incubation program, I didn’t realize how AGI-focused 80k was.
People will never get the same mistaken impression when looking at Non-Linear or Lightcone or BERI or SFF. I think part of the problem is (a) putting up a lot of causes on the problems page, which gives the reader the impression of a big tent/broad focus, and (b) having normie aesthetics (compare: longtermist websites). While I do think it’s correct and valuable to do both, the downside is that without more explicit clarification (e.g. what Non-Linear does, just bluntly saying on the front page in font 40: “We incubate AI x-risk nonprofits by connecting founders with ideas, funding, and mentorship”), the casual reader of the website doesn’t understand that 80k basically works on AGI.
Yeah many of those things seem right to me.
I suspect the crux might be that I don’t necessarily think it’s a bad thing if “the casual reader of the website doesn’t understand that 80k basically works on AGI”. E.g. if 80k adds value to someone as they go through the career guide, even if they don’t realise that “the organization strongly recommends AI over the rest, or that x-risk gets the lion’s share of organizational resources”, is there a problem?
I would be concerned if 80k were not adding value. E.g. I can imagine more salesy tactics that look like making a big song and dance about how much the reader needs their advice, without providing any actual guidance until they deliver the final pitch, where the reader is basically given the choice of signing up for 80k’s view/service or looking for some alternative provider/resource that can help them. But I don’t think that’s happening here.
I can also imagine being concerned if the service was not transparent until you were actually on the call, and then you received some sort of unsolicited cause prioritisation pitch. But again, I don’t think that’s what’s happening; as discussed, it’s pretty transparent on the advising page and cause prio page what they’re doing.