Michael Townsend, Program Associate on Open Phil's Global Catastrophic Risks Capacity Building team; GWWC Pledger.
Speaking personally, I have also perceived a move away from longtermism, and as someone who finds longtermism very compelling, this has been disappointing to see. I agree it has substantive implications for what we prioritise.
Speaking more on behalf of GWWC, where I am a researcher: our motivation for changing our cause area from "creating a better future" to "reducing global catastrophic risks" really was not based on PR. As shared here:
We think of a "high-impact cause area" as a collection of causes that, for donors with a variety of values and starting assumptions ("worldviews"), provide the most promising philanthropic funding opportunities. Donors with different worldviews might choose to support the same cause area for different reasons. For example, some may donate to global catastrophic risk reduction because they believe this is the best way to reduce the risk of human extinction and thereby safeguard future generations, while others may do so because they believe the risk of catastrophes in the next few decades is sufficiently large and tractable that it is the best way to help people alive today.
Essentially, we're aiming to use the term "reducing global catastrophic risks" as a kind of superset that includes reducing existential risk, and that is inclusive of all the potential motivations. For example, when looking for recommendations in this area, we would be happy to include recommendations that only make sense from a longtermist perspective. A large part of the motivation for this was that we found some of the arguments made in several of the posts you linked (including "EA and Longtermism: not a crux for saving the world") compelling.
Also, our decision to step down from managing the communications for the Longtermism Fund (now the "Emerging Challenges Fund") was based on wanting to be able to more independently evaluate Longview's grantmaking, rather than brand association.
I disagree-voted because I feel like I've done much more than 100 hours of reading on AI Policy (including finishing the AI Safety Fundamentals Governance course), and I still have a strong sense that there's a lot I don't know, and regularly come across new work that I find insightful. Very possibly I'm prioritising reading the wrong things (and would really value a reading list!) but thought I'd share my experience as a data point.
I think it's a solid improvement! I only occasionally browsed the previous version, but I remember it being a bit tricky to find the headline figures I was interested in after hearing them cited on podcasts, whereas now, going to https://epochai.org/trends, they all seem quite easy to find (and to dig into the details of) thanks to the intuitive/elegant layout.
We did!
Our team put a lot of thought into the job description, which highlights the essential and desirable skills we were looking for. Each test was written with these criteria in mind, and we also used them to help reviewers score responses.[1] This helped reviewers provide scores more consistently and purposefully. Just to avoid overstating things, though, I'd add that we weren't just trying to legalistically make sure every question had a neat correspondence to previously written criteria, but instead were thinking "is this representative of the type of work the role involves?"
[1] This is probably a bit more in the weeds than necessary, but though the initial application questions were written with clear reference to essential/desirable skills in the job description, I didn't convert that into a clear grading rubric for reviewers to use. This was just an oversight.
We did correspond via email, but yes, that's right: we didn't have a video call with any candidates until the work trial.
I think there's a case to have had a call before then, as suggested by one of the candidates who gave us feedback:
One helpful suggestion they offered us was running a Q&A session with each candidate just before the work trial. This could have been an opportunity to more casually meet with them, and discuss any concerns they might have about the work trial.
The reason it's non-obvious to me whether that would have been worthwhile is that it would have lengthened the process (in our case, due to the timing of leave commitments, the delay would have been considerable).
Yes, we are aiming to publish this next week, and it should include an explanation of the delay. (Also, thanks for checking in on this; the accountability is helpful.)
I don't have any particularly strong views, and would be interested in what others think.
Broadly, I feel like I agree that more specificity/transparency is helpful, though I don't feel convinced that it's not also worth asking, at some stage in the application, an open-ended question like "Why are you interested in the role?". I'm not sure I can explain/defend my intuitions here much right now, but I would like to think more on it when I get around to writing some reflections on the Research Communicator hiring process.
I'm not sure I follow what you mean by transparency in this context. Do you mean being more transparent about what exactly we were looking for? In our case we asked for <100 words on "Why are you interested in this role?" and "Briefly, what is your experience with effective giving and/or effective altruism?", and we were just interested in seeing whether applicants' interest/experience aligned with the skills, traits and experience we listed in the job descriptions.
In the hiring round I mentioned, we did time submissions for the work tests, and at least my impression is that the way we did so worked out fairly well. Having a timed component for the initial application is also possible, but might require more of an "honour code" system, as setting up a process that allows for verification of the time spent is a pretty big investment for the first stage of an application.
As a former applicant for many EA org roles, I strongly agree! I recall spending 2-8 times longer on some initial applications than the job ads estimated.
As someone who just helped drive a hiring process for Giving What We Can (for a Research Communicator role), I feel a bit daft having experienced it on the other side but not having learned from it. I/we did not do a good enough job here. We had a few initial questions that we estimated would take ~20-60 minutes, and in retrospect I now imagine many candidates would have spent much longer than this (I know I would have).
Over the coming month or so I'm hoping to draft a post with reflections on what we learned from this, and how we would do better next time (inspired by Aaron Gertler's 2020 post on hiring a copyeditor for CEA). I'll be sure to include this comment and its suggestion (having a link at the end of the application form where people can report how long it actually took to fill the form in) in that post.
Thanks for this post! I appreciate your writing, and also appreciated the images you included; they made the post more fun to read.
I wrote some feedback privately which the author thought would be good to share publicly, so this is a lightly edited version of that feedback:
The post was quite long, taking 10-15 minutes or so for me to read. I think this was because you wrote it in quite a careful way, including caveats, counterarguments, etc., and I'm not sure all of this was necessary.
I think a shorter post (~1/3 the length) that just explained what convenience means using a few examples could have been better. In particular, it would be useful to emphasise examples where existing terminology fails but "convenience" succeeds.
On that last point: I can't immediately think of an example where "convenience" would be helpful (except for times I would already use the word "convenience"), so I don't feel sold on the term. I also think we should have a very high bar for adding jargon. In the examples you gave, I generally either prefer the original sentence you included, would already use the term "convenient" (if it came to mind), or think there's a better way of conveying the same meaning using a different term.
To combine the few comments above: I think it's difficult to decide from the armchair which jargon will be helpful. So rather than a carefully made argument for the uptake of a particular term, I think it's better to just define the term and put it out there (with a few examples). If it's useful enough, people will use it; if not, it probably won't catch on (and I don't think a careful argument would have made the difference).
I found the convenience accounting part quite confusing. Specifically, I don't get how the concept of convenience helps with this kind of accounting, and (as I think you seem to believe, based on your "accountant foolishly trying to list..." example) I don't think this accounting is actually helpful for most decisions.
I really like the general concept of trying to keep track of what is and is not convenient to you, your organisation, others around you, etc. I appreciated you giving such honest examples of your own conveniences. I'm not sure you needed the term to do this, but I do think it's good practice.
Thanks for conducting this impact assessment, for sharing this draft with us before publishing it, and for your help with GWWC's own impact evaluation! A few high-level comments (as a researcher at GWWC):
First, just reiterating that we appreciate others checking our assumptions and sharing their views on them.
As other commenters have discussed, we don't think it makes sense to only account for our influence on longtermist donations. We'd like to do a better job explaining our views here, which we see as similar to Open Philanthropy's worldview diversification.
I also appreciate your acknowledgement of the limitations of your approach (some of which are similar to ours), in particular that you have not modelled our potential indirect benefits, which may well be the driver of our impact.
Regarding the difference between how you modelled the value of the GWWC Pledge and how we did:
As a quick summary for others: the key difference is that GWWC's impact evaluation worked out the value of the Pledge by looking at GWWC Pledgers as an overall cohort, and at the average amount donated by Pledgers each year over their Pledge tenure. The analysis in this evaluation (explained in the post) looks at Pledgers as individuals, models each of them in turn, and takes the average of those models. (Please correct me if I'm wrong here!)
Consequently, this approach uses a "richer" set of information, though I also see it as requiring more assumptions (that the rules for extrapolating each individual Pledger's giving are in fact correct). Our approach, by contrast, uses less information, but only assumes that, on average, past data will be indicative of future data. I'd be interested in whether you think this is a fair summary.
I have some intuitions that GWWC's approach is more robust, but that this one, if done well, could potentially be more valid. They're just intuitions though, and I haven't thought too deeply about it.
I find it interesting that this approach appears to lead to more optimistic conclusions about GWWC's impact (despite the way it "bounds" how any individual Pledger's giving can be extrapolated over time).
Thanks again for your work!
Hi Michael, thank you for the response.
No problem!
Regarding: "Also, wouldn't the above 'x-risk discount rate' be 2% rather than 0.2%?"
There was a typo in my answer before: the calculation is 1 - (1 - 1/6)^(1/100) ≈ 0.0018, which is ~0.2% (not 0.2), and is a fair amount smaller than the discount rate we actually used (3.5%). Still, if you assigned a greater probability of existential risk this century than Ord does, you could end up with a (potentially much) higher discount rate. Alternatively, even with a high existential risk estimate, if you thought we were going to find more and more cost-effective giving opportunities as time goes on, then at least for the purpose of our impact evaluation, these effects could cancel out.
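For anyone who wants to check that arithmetic, here's a minimal sketch in Python (the 1-in-6 century-level risk is Ord's estimate; the 3.5% comparison rate is the Green Book's):

```python
# Per-year discount implied by a 1-in-6 chance of existential
# catastrophe over 100 years: solve (1 - r)^100 = 1 - 1/6 for r.
century_risk = 1 / 6
annual_rate = 1 - (1 - century_risk) ** (1 / 100)
print(f"{annual_rate:.4%}")  # ~0.18% per year, i.e. roughly 0.2%
# For comparison, the Green Book rate we actually used is 3.5%.
```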
I think if we spent more time trying to come to an all-things-considered view on this topic, we'd still be left with considerable uncertainty, and so I think it was the right call for us to just acknowledge this and take the pragmatic approach of deferring to the Green Book.
In terms of the general tension between potentially high x-risk and the chance of transformative AI, I can only speak personally (not on behalf of GWWC). It's something on my mind, but it's unclear to me what exactly the tension is. I still think it's great to move money to effective charities across a range of impactful causes, and I'm excited about building a culture of giving significantly and effectively throughout one's life (i.e., via the Pledge). I don't think GWWC should pivot and become specifically focused on one cause (e.g., AI), and otherwise I'm not sure exactly what the potential for transformative AI should imply for GWWC.
Hi Phib, Michael from the GWWC Research team here! In our latest impact evaluation we did need to consider how to think about future donations. We explain how we did this in the appendix "Our approach to discount rates". Essentially, it's a really complex topic, and you're right that existential risk plays into it (we note this as one of the key considerations). If you discount the future based just on Ord's existential risk estimates, then based on some quick maths, the 1 in 6 chance over 100 years should discount each year by 0.2% (1 - (1 - 1/6)^(1/100) = 0.02).
Yet there are many other considerations that also weigh into this, at least from GWWC's perspective. The most significant is how we should expect the cost-effectiveness of charities to change over time.
We chose to use a discount rate of 3.5% for our best-guess estimates (and 5% for our conservative estimates), based on the recommendation from the UK government's Green Book. We explain why we made that decision in our report. It was largely motivated by our framework of being useful/transparent/justifiable over being academically correct and thorough.
If you're interested in this topic, and in how to think about discount rates in general, you may find Founders Pledge's report on investing to give an interesting read.
Hi Joel, great questions!
(1) Are non-reporters counted as giving $0?
Yes, at least for recorded donations (i.e., the donations that are within our database). For example, in cell C41 of our working sheet, we provide the average recorded donations of a GWWC Pledger in 2022 USD ($4,132), and this average assumes non-reporters are giving $0. Similarly, in our "pledge statistics" sheet, which provides the average amount we record being given per Pledger per cohort and by year, we also assumed non-reporters are giving $0.
(2) Does this mean we are underestimating the amount given by Pledgers?
Only for recorded donations; we also tried to account for donations made that are not in our records. We discuss this more here, but in sum: for our best-guess estimates, we estimated that our records only account for 79% of all pledge donations, and therefore we need to make an upwards adjustment of 1.27 to go from recorded donations to all donations made. We discuss how we arrived at this estimate pretty extensively in our appendix (with our methodology here being similar to how we analysed our counterfactual influence). For our conservative estimates, we did not make any recording adjustments, and we think this does underestimate the amount given by Pledgers.
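To make the adjustment concrete, here's a minimal sketch (the 79% figure is from our appendix; the per-Pledger recorded average is the $4,132 figure mentioned above):

```python
# If records capture only 79% of all pledge donations, the multiplier
# from recorded to total donations is 1 / 0.79.
recorded_share = 0.79
adjustment = 1 / recorded_share
print(f"{adjustment:.2f}")  # ~1.27

recorded_avg = 4_132  # average recorded donations per Pledger (from the working sheet)
estimated_total = recorded_avg * adjustment
print(f"${estimated_total:,.0f}")  # best-guess estimate including unrecorded donations
```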
(3) How did we handle nonresponse bias and could we handle it better?
When estimating our counterfactual influence, we explicitly accounted for nonresponse bias. To do so, we treated respondents and nonrespondents separately, assuming our influence on nonrespondents was a fraction of our influence on respondents for all surveys (illustrated in the sketch below):
- 50% for our best-guess estimates.
- 25% for our conservative estimates.
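To illustrate how that weighting plays out, here's a minimal sketch; apart from the 50%/25% fractions, all numbers are hypothetical:

```python
# Blend survey-based influence across respondents and nonrespondents,
# assuming nonrespondents were influenced at a fraction of the
# respondent rate.
respondent_influence = 0.40      # hypothetical: influence measured among respondents
nonrespondent_fraction = 0.50    # best-guess assumption (0.25 for conservative estimates)
n_respondents, n_nonrespondents = 300, 700  # hypothetical survey split

blended = (
    n_respondents * respondent_influence
    + n_nonrespondents * respondent_influence * nonrespondent_fraction
) / (n_respondents + n_nonrespondents)
print(f"{blended:.0%}")  # 26%: the cohort-wide counterfactual influence
```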
We actually did consider adjusting this fraction depending on the survey we were looking at, and in our appendix we explain why we chose not to in each case. Could we handle this better? Definitely! I really appreciate your suggestions here; we explicitly outline handling nonresponse bias as one of the ways we would like to improve future evaluations.
(4) Could we incorporate population base rates of giving when considering our counterfactual influence?
I'd love to hear more about this suggestion; it's not obvious to me how we could do this. For example, one interpretation here would be to look at how much Pledgers are giving compared to the population base rate. Presumably, we'd find they are giving more. But I'm not sure how we could use that to inform our counterfactual influence, because there are at least two competing explanations for why they are giving more:
- One explanation is that we are simply causing them to give more (so we should increase our estimated counterfactual influence).
- Another is that we are just selecting for people who are already giving a lot more than the average population (in which case, we shouldn't increase our estimated counterfactual influence).
But perhaps I'm missing the mark here, and this kind of reasoning/analysis is not really what you were thinking of. As I said, I would love to hear more on this idea.
(Also, I appreciate your kind words on the thoroughness/robustness.)
Thanks :)!
You can see in the donations by cause area a breakdown of the causes pledge and non-pledge donors give to. This could potentially inform a multiplier for the particular cause areas. I don't think we considered doing this, and I'm not sure it's something we'll do in future, but we'd be happy to see others do this using the information we provide.

Unfortunately, we don't have a strong sense of how we influenced which causes donors gave to; the only thing that comes to mind is our question, "Please list your best guess of up to three organisations you likely would *not* have donated to if Giving What We Can, or its donation platform, did not exist (i.e. donations where you think GWWC has affected your decision)", the results of which you can find on page 19 of our survey documentation here. Only an extremely small sample of non-pledge donors responded to the question, though. Getting a better sense of our influence here, as well as generally analysing trends in which cause areas our donors give to, is something we'd like to explore in our future impact evaluations.
Ah, I can see what you mean regarding our text, I assume in this passage:
We want to emphasise that this data surprised us and caused us to reevaluate a key assumption we had when we began our impact evaluation. Specifically, we went into this impact evaluation expecting to see some kind of decay per year of giving. In our 2015 impact evaluation, we assumed a decay of 5% (and even this was criticised for seeming optimistic compared to EA Survey data, a criticism we agreed with at the time). Yet what we in fact seem to be seeing is an increase in average giving per year since taking the Pledge, even when adjusting for inflation.
What you say is right: we agree there seems to be a decay in fulfilment/reporting rates (which is what the earlier attrition discussion was mostly about), but we just add the additional observation that giving increasing over time makes up for this.
There is a sense in which we do disagree with that earlier discussion, which is that we think the kind of decay that would be relevant to modelling the value of the Pledge is the decay in average giving over time, and at least here, we do not see a decay. But we could've been clearer about this; at least on my reading, I think the paragraph I quoted above conflates different sorts of "decay".
Really appreciate this analysis, Jeff.
Point taken that there is no clear plateau at 30%; it'll be interesting to see what future data shows.

Part of the reason we have less analysis on the change in reporting rates over time is that we did not directly incorporate this rate of change into our model. For example, the table of reporting rates was primarily used in our evaluation to test a hypothesis for why we see an increase in average giving (even assuming people who are not reporting are not giving at all). Our model does not assume reporting rates don't decline, nor does it assume the decline in reporting rates plateaus.
Instead, we investigated how average giving (which is a product of both reporting rates and the average amount given conditional on reporting) changes over time. We saw that the decline in reporting rates is (more than) compensated for by the increase in giving conditional on reporting. It could be that this will no longer remain true beyond a certain time horizon (though perhaps it will!), but there are other arguably conservative assumptions for these long time horizons (e.g., that giving stops at pension age and doesn't include any legacy giving). Some of these considerations come up as we discuss why we did not assume a decay in our influence and in the limitations of our Pledge model (at the bottom of this section, right above this one).
On your final point:
Separately, I think it would be pretty reasonable to drop the pre-2011 reporting data. I think this probably represents something weird about starting up, like not collecting data thoroughly at first, and not about user behavior? I haven't done this in my analysis above, though, because since I'm weighting by cohort size it doesn't do very much.
Do you mean excluding it just for the purpose of analysing reporting rates over time? If so, that could well be right, and if we investigate this directly in future impact evaluations we'll need to look into the quality/relevance of that data and make a call here.
Thanks for your questions Jeff!
To answer point by point:
How does [the evaluation's finding that Pledgers seem to be giving more on average each year after taking the Pledge] handle members who aren't reporting any donations?
The (tentative) finding that Pledgers' giving increases each year after taking the Pledge assumes that members who aren't reporting any donations are not donating.
How does reporting rate vary by tenure?
We include a table "Proportion of GWWC Pledgers who record any donations by Pledge year (per cohort)" on page 48. In sum: reporting declines in the years after the Pledge, but that decline seems to plateau at a reporting rate of ~30%.
Was the $7,619 the average among [the 250-person sample we used for the GWWC reporting accuracy survey] who recorded any donations, or counting ones who didn't record donations as having donated $0? What fraction of members in the 250-person sample recorded any donations?
The $7,619 figure is the average if you count those not recording a donation as having donated $0. Unfortunately, I don't have the fraction of the 250-person sample who recorded donations at all on hand. However, I can give an informed guess: the sample was a randomly selected group of people who had taken the GWWC Pledge before 2021, and eyeballing the table I linked above, ~40-50% of pre-2021 Pledgers record a donation each year.
Where does the decline in the proportion of people giving fit into the model?
The model does not directly incorporate the decrease in the proportion of people recording/giving, and neither does it directly incorporate the increase in donation sizes for people who record/give. The motivation here is that, at least in the data so far, we see these effects cancel out (indeed, we see that the increase in donation size slightly outweighs the decrease in recording rates, though we're not sure that trend will persist). We go into much more depth on this in our appendix section "Why we did not assume a decay in the average amount given per year".
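As a stylised illustration of why the model can leave both trends out, here's a minimal sketch; all numbers are hypothetical:

```python
# Average giving per Pledger = reporting rate x average gift
# conditional on reporting. A declining reporting rate can be offset
# by growth in conditional gifts, leaving the product roughly flat.
reporting_rate = [0.50, 0.40, 0.30]            # hypothetical decline over tenure
avg_gift_if_reporting = [4_000, 5_200, 7_000]  # hypothetical growth over tenure

for year, (rate, gift) in enumerate(zip(reporting_rate, avg_gift_if_reporting), start=1):
    print(year, round(rate * gift))  # per-Pledger average: 2000, 2080, 2100
```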
Hi Vasco, thanks for your questions!
I'll answer what I see as the core of your questions before providing some quick responses to each individually.
As you suggest, our approach is very similar to Open Philanthropy's worldview diversification. One way of looking at it: we want to provide donation recommendations that maximise cost-effectiveness from the perspective of a particular worldview. We think it makes sense to add another constraint to this, which is that we prioritise providing advice for more plausible worldviews that are consistent with our approach (i.e., focusing on outcomes, having a degree of impartiality, and wanting to rely on evidence and reason).
I'll share how this works with an example. The "global health and wellbeing" cause area contains recommendations that appeal to people with some combination of the following beliefs:
1. We should prioritise helping people over animals.
2. We should be somewhat sceptical of highly theoretical theories of change, and prefer donating to charities whose impact is supported by evidence.
3. It's very valuable to save a life.
4. It's very valuable to improve someone's income.
People may donate to the cause area without all of these beliefs, or with some combination, or perhaps with none of them but with another motivation not included. Perhaps they have more granular beliefs on top of these, which means they might only be interested in a subset of the fund (e.g., focusing on charities that improve lives rather than save them).
Many of your questions seem to suggest that, when we account for consumption of animal products, (3) and (4) are not so plausible. I suspect that this is among the strongest critiques of worldviews that would support GHW. I have my own views about it (as would my colleagues), but from a "GWWC" perspective, we don't feel confident enough in this argument to use it as a basis for not supporting this kind of work. In other words, we think the worldviews that would want to give to GHW are sufficiently plausible.
I acknowledge there's a question-begging element to this response: I take it your point is, why is it sufficiently plausible, and who decides this? Unfortunately, we have to acknowledge that we don't have a strong justification here. It's a subjective judgement formed by the research team, informed by existing cause prioritisation work from other organisations. We don't feel well-placed to do this work directly (for much the same reason as we need to evaluate evaluators rather than doing charity evaluation ourselves). We would be open to investigating these questions further by speaking with organisations engaging in this cause prioritisation; we'd love to have a more thoughtful and justified approach to cause prioritisation. In other words, I think you're pushing on the right place (and hence this answer isn't particularly satisfying).
More generally, we're all too aware that there are only two of us working directly on deciding our recommendations, and we are reluctant to use our own personal worldviews in highly contested areas to determine them. Of course, this has to happen to some degree (and we aim to be transparent about it). For example, if I were to donate today, I would likely give 100% of my donations to our Risks and Resilience Fund. I have my reasons, and think I'm making the right decision according to my own views, but I'm aware others would disagree with me, and in my role I need to make decisions about our recommendations through the lens of commonly held worldviews I disagree with.
I'll now go through your questions individually:
We'd likely suggest donating to our cause area funds via the "all cause bundle", splitting their allocations equally between the three areas. This is our default "set-and-forget" option, which seems compelling from the perspective of wanting to give a fraction of one's giving to causes that are maximally effective from particular worldviews. This is not the optimal allocation under moral uncertainty (on this approach, the different worldviews could "trade" and increase their combined impact); we haven't prioritised trying to find such an optimised portfolio for this purpose. It'd be an interesting project, and we would encourage anyone to do this and share it on the Forum and with us!
We are not confident. This is going to depend on how you value animals compared to humans; we're also not sure exactly how cost-effective the AWF Fund is (just that it is the best option we know of in a cause area we think is generally important, tractable and neglected).
If we thought there wasn't a sufficiently plausible worldview whereby TCF was the best option we knew of, we would not recommend it.
We did not consider this, and so do not have a considered answer. I think this would be something we would be interested in considering in our next investigation.
As above, we would if we didn't think there was a sufficiently strong worldview by which TCF was the best option we knew of. This could be because of a combination of the meat-eater problem and the view that it's just not plausible to discount animals. It's an interesting question, but it's also one where I'm not sure our comparative advantage is in coming to a view on it (though perhaps, just as we did with the view that GW should focus on economic progress, we could still discuss it in our evaluation).