Currently Research Director at Founders Pledge, but posts and comments represent my own opinions, not FP’s, unless otherwise noted.
I worked previously as a data scientist and as a journalist.
Hey Darren, thanks for doing this AMA — and thanks for doing your part to steer money to such a valuable and critically important cause.
Can you describe a bit about the decision-making process at Beast Philanthropy? More to the point, what would an optimal decision-making process look like, in your view? e.g. how would you use research, how would you balance giving locally vs globally, think about doing the most good possible (or constraining that in some way), etc?
I listened to the whole episode — if I understood correctly, they are mostly skeptical that there are effects at very low blood lead levels. At the end of the podcast, Stuart or Tom (can’t remember which) explicitly says that they’re not skeptical that lead affects IQ, and they spend most of the episode addressing the claimed relationship at low BLLs (rather than the high ones addressed by LEEP, CGD, other interventions).
I’d be interested in exploring funding this and the broader question of ensuring funding stability and security robustness for critical OS infrastructure. @Peter Wildeford is this something you guys are considering looking at?
I’m also strongly interested in this research topic — note that although the problem is worst in the U.S., the availability and affordability of fentanyl (which appears to be driving OD deaths) suggests that this could easily spread to LMICs in the medium-term, suggesting that preventive measures such as vaccines could even be cost-effective by traditional metrics.
Easily reconciled — most of our money moved is via advising our members. These grants are in large part not public, and members also grant to many organizations that they choose irrespective of our recommendations. We provide the infrastructure to enable this.
The Funds are a relatively recent development, and indeed some of the grants listed on the current Fund pages were actually advised by the fund managers, not granted directly from money contributed to the Fund (this is noted on the website if it’s the case for each grant). Ideally, we’d be able to grow the Funds a lot more so that we can do much more active grantmaking, and at the same time continue to advise members on effective giving.
My team (11 people at the moment) does generalist research across worldviews — animal welfare, longtermism/GCRs, and global health and development. We also have a climate vertical, as you note, which I characterize in more detail in this previous forum comment.
EDIT:
Realized I didn’t address your final question. I think we are a mix, basically — we are enabling successful entrepreneurs to give, period (in fact, we are committing them to do so via a legally binding pledge), and we are trying to influence as much of their giving as possible toward the most effective possible things. It is probably more accurate to represent FP as having a research arm, simply given staff proportions, but equally accurate to describe our recommendations as being “research-driven.”
We (Founders Pledge) do have a significant presence in SF, and are actively trying to grow much faster in the U.S. in 2024.
A couple weakly held takes here, based on my experience:
Although it’s true that issues around effective giving are much more salient in the Bay Area, it’s also the case that effective giving is nearly as much of an uphill battle with SF philanthropists as with others. People do still have pet causes, and there are many particularities about the U.S. philanthropic ecosystem that sometimes push against individuals’ willingness to take the main points of effective giving on board.
Relatedly, growing in SF seems in part to be hard essentially because of competition. There’s a lot of money and philanthropic intent, and a fair number of existing organizations (and philanthropic advisors, etc) that are focused on capturing that money and guiding that philanthropy. So we do face the challenge of getting in front of people, getting enough of their time, etc.
Since FP has historically offered mostly free services to members, growing our network in SF is something we actually need to fundraise for. On the margin I believe it’s worthwhile, given the large number of potentially aligned UHNWs, but it’s the kind of investment (in this case, in Founders Pledge by its funders) that would likely take a couple years to bear fruit in terms of increased amounts of giving to effective charities. I expect this is also a consideration for other existing groups that are thinking about raising money for a Bay Area expansion.
I think your arguments do suggest good reasons why nuclear risk might be prioritized lower; since we operate on the most effective margin, as you note, it is also possible at the same time for there to be significant funding margins in nuclear that are highly effective in expectation.
My point is precisely that you should not assume any view. My position is that the uncertainties here are significant enough to warrant some attention to nuclear war as a potential extinction risk, rather than to simply bat away these concerns on first principles and questionable empirics.
Where extinction risk is concerned, it is potentially very costly to conclude on little evidence that something is not an extinction risk. We do need to prioritize, so I would not for instance propose treating bad zoning laws as an X-risk simply because we can’t demonstrate conclusively that they won’t lead to extinction. Luckily there are very few things that could kill very large numbers of people, and nuclear war is one of them.
I don’t think my argument says anything about how nuclear risk should be prioritized relative to other X-risks. I think the arguments for deprioritizing it relative to others are strong, and reasonable people can disagree; YMMV.
If you leave 1,000 − 10,000 humans alive, the longterm future is probably fine
This is a very common claim that I think needs to be defended somewhat more robustly rather than simply assumed. If we have one strength as a community, it is in not simply assuming things.
My read is that the evidence here is quite limited, that the outside view suggests losing 99.9999% of a species / having a very small population is a significant extinction risk, and that the uncertainty around the long-term viability of collapse scenarios is reason enough to want to avoid near-extinction events.
Has there been any formal probabilistic risk assessment on AI X-risk? e.g. fault tree analysis or event tree analysis — anything of that sort?
I disagree with the valence of the comment, but think it reflects legitimate concerns.
I am not worried that “HLI’s institutional agenda corrupts its ability to conduct fair-minded and even-handed assessment.” I agree that there are some ways that HLI’s pro-SWB-measurement stance can bleed into overly optimistic analytic choices, but we are not simply taking analyses by our research partners on faith and I hope no one else is either. Indeed, the very reason HLI’s mistakes are obvious is that they have been transparent and responsive to criticism.
We disagree with HLI about SM’s rating — we use HLI’s work as a starting point and arrive at an undiscounted rating of 5-6x; subjective discounts place it at 1-2x, which squares with GiveWell’s analysis. But our analysis was facilitated significantly by HLI’s work, which remains useful despite its flaws.
I guess I would very slightly adjust my sense of HLI, but I wouldn’t really think of this as an “error.” I don’t significantly adjust my view of GiveWell when they delist a charity based on new information.
I think if the RCT downgrades StrongMinds’ work by a big factor, that won’t really introduce new information about HLI’s methodology/expertise. If you think there are methodological weaknesses that would cause them to overstate StrongMinds’ impact, those weaknesses should be visible now, irrespective of the RCT results.
I can also vouch for HLI. Per John Salter’s comment, I may also have been a little sus on HLI early on (sorry Michael), but their work has been extremely valuable for our own methodology improvements at Founders Pledge. The whole team is great, and I will second John’s comment to the effect that Joel’s expertise is really rare and that HLI seems to be the right home for it.
Just a note here as the author of that lobbying post you cite: the CEA including the 2.5% change in chance of success is intended to be illustrative — well, conservative, but it’s based on nothing more than a rough sense of effect magnitude from having read all those studies for the lit review. The specific figures included in the CEA are very rough. As Stephen Clare pointed out in the comments, it’s also probably not realistic to have modeled that effect as normally distributed with a [0, 5] 95% CI.
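For context, here’s a quick sketch of what that normal assumption implies (my own illustrative calculation, not something from the original post): a symmetric 95% CI of [0, 5] around a mean of 2.5 percentage points implies a standard deviation of roughly 1.28, which in turn puts about 2.5% of the probability mass on the policy change reducing the chance of success.

```python
from scipy.stats import norm

mean = 2.5                    # central estimate: percentage-point change in chance of success
sd = (5 - 0) / (2 * 1.96)     # standard deviation implied by a symmetric 95% CI of [0, 5]
print(round(sd, 2))           # ~1.28
print(norm.cdf(0, mean, sd))  # ~0.025: probability mass below zero under the normal model
```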
Hey Vasco, you make lots of good points here that are worth considering at length. These are topics we’ve discussed on and off in a fairly unstructured way on the research team at FP, and I’m afraid I’m not sure what’s next when it comes to tackling them. We don’t currently have a researcher dedicated to animal welfare, and our recommendations in that space have historically come from partner orgs.
Just as context, the reason for this is that FP has historically separated our recommendations into three “worldviews” (longtermism, current generations, and animal welfare). The idea is that it’s a lot easier to shift member grantmaking across causes within a worldview (e.g. from rare diseases to malaria) than across worldviews (e.g. to get people to care much more about chickens). The upshot of this, for better or for worse, is that we end up spending a lot of time prioritizing causes within worldviews, and avoiding the question of how to prioritize across worldviews.
This is also part of the reason we don’t have a dedicated animal welfare researcher — we haven’t historically moved as much money within that worldview as within our others. But I’m actually not sure which way the causality flows in that case, so your post is a good nudge to think more seriously about this, as well as the ways we might be able to incorporate animal welfare considerations into our GHD calculations, worldview separations notwithstanding.
Hey Matthew, thanks for sharing this. Can you provide some more information (or link to your thoughts elsewhere) on why fervor around UV-C is misplaced? As you know, ASHRAE Standards 185.1 and 185.2 concern testing of UV devices for germicidal irradiation, so I’d be particularly interested to know if this was an area that ASHRAE itself had concluded was unpromising.
I thought of some other down-the-line feature requests:
Google Sheets integration (we currently already store our forecasts in a Google sheet)
Relatedly, ability to export to CSV (does this already exist and I just missed it?)
Ability to designate a particular resolver
Different formal resolution mechanisms, like a poll of users.
Ah, great! I think it would be nice to offer different aggregation options, though if you do offer one I agree that geo mean of odds is the best default. But I can imagine people wanting to use medians or averages, or even specifying their own aggregation functions. Especially if you are trying to encourage uptake by less technical organizations, it seems important to offer at least one option that is more legible to less numerate people.
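For anyone unfamiliar, here’s a minimal sketch of what the geometric mean of odds looks like next to the other aggregation options mentioned (the function name and example forecasts are mine, purely illustrative):

```python
import numpy as np

def geo_mean_of_odds(probs):
    """Aggregate probability forecasts via the geometric mean of their odds."""
    odds = np.array(probs) / (1 - np.array(probs))
    agg_odds = np.exp(np.mean(np.log(odds)))  # geometric mean of the odds
    return agg_odds / (1 + agg_odds)          # convert back to a probability

forecasts = [0.05, 0.20, 0.60]
print(geo_mean_of_odds(forecasts))  # ~0.21
print(np.mean(forecasts))           # ~0.28 (arithmetic mean of probabilities)
print(np.median(forecasts))         # 0.20 (median)
```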
I have already installed this and started using it at Founders Pledge. Thanks for making this! I’ve been wanting something like this for a long time.
Some feature requests:
Aggregation choices (e.g. geo mean of odds would be nice)
Brier scores for users
Calibration curves for users (a rough sketch of these last two is below)
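To make those last two requests concrete, here’s a rough sketch of the sort of thing I have in mind (my own illustrative code and toy data, not a spec):

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes (lower is better)."""
    probs, outcomes = np.array(probs), np.array(outcomes, dtype=float)
    return np.mean((probs - outcomes) ** 2)

def calibration_curve(probs, outcomes, bins=5):
    """Return (mean forecast, observed frequency) pairs for each probability bin that has data."""
    probs, outcomes = np.array(probs), np.array(outcomes, dtype=float)
    edges = np.linspace(0, 1, bins + 1)
    points = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs >= lo) & (probs < hi)  # a forecast of exactly 1.0 would need the last bin to be inclusive
        if mask.any():
            points.append((probs[mask].mean(), outcomes[mask].mean()))
    return points

# Toy example: one user's resolved forecasts and the actual outcomes
probs = [0.9, 0.7, 0.3, 0.8, 0.2, 0.6]
outcomes = [1, 1, 0, 0, 0, 1]
print(brier_score(probs, outcomes))                # ~0.17
print(calibration_curve(probs, outcomes, bins=5))  # e.g. [(0.25, 0.0), (0.65, 1.0), (0.85, 0.5)]
```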
FP Research Director here.
I think Aidan and the GWWC team did a very thorough job on their evaluation, and in some respects I think the report serves a valuable function in pushing us towards various kinds of process improvements.
I also understand why GWWC came to the decision they did: to not recommend GHDF as competitive with GiveWell. But I’m also skeptical that any organization other than GiveWell could pass this bar in GHD, since it seems that in the context of the evaluation GiveWell constitutes not just a benchmark for point-estimate CEAs but also a benchmark for various kinds of evaluation practices and levels of certainty.
I think this comes through in three key differences in perspective:
Can a grant only be identified as cost-effective in expectation if lots of time is spent making an unbiased, precise estimate of its cost-effectiveness?
Should CEAs be the singular determinant of whether or not a grant gets made?
Is maximizing calculated EV in the case of each individual grant the best way to ensure cost-effectiveness over the span of an entire grantmaking programme?
My claim is that, although I’m fairly sure GWWC would not explicitly say “yes” to each of these questions, the implications of their approach suggest otherwise. FP, meanwhile, thinks the answer to each is clearly “no.” I should say that GWWC has been quite open in saying that they think GHDF could pass the bar or might even pass it today — but I share other commenters’ skepticism that this could be true by GWWC’s lights in the context of the report! Obviously, though, we at FP think the GHDF is >10x.
The GHDF is risk-neutral. Consequently, we think that spending time reducing uncertainty about small grants is not worthwhile: it trades off against time that could be spent evaluating and making more plausibly high-EV grants. As Rosie notes in her comment, a principal function of the GHDF has been to provide urgent stopgap funding to organizations that quite often end up actually receiving funding from GW. Spending GW-tier effort getting more certain about $50k-$200k grants literally means that we don’t spend that time evaluating new high-EV opportunities. If these organizations die or fail to grow quickly, we miss out on potentially huge upside of the kind that we see in other orgs of which FP has been an early supporter. Rosie lists several such organizations in her comment.
The time and effort that we don’t spend matching GiveWell’s time expenditure results in higher variance around our EV estimates, and one component of that variance is indeed human error. We should reduce that error rate — but the existence of mistakes isn’t prima facie evidence of lack of rigor. In our view, the rigor lies in optimizing our processes to maximize EV over the long term. This is why we have, for instance, guidelines for time expenditure based on the counterfactual value of researcher time. This programme entails some tolerance for error. I don’t think this is special pleading: you can look at GHDF’s list of grantees and find a good number that we identified as cost-effective before having that analysis corroborated by later analysis from GiveWell or other donors. This historical giving record, in combination with GWWC’s analysis, is what I think prospective GHDF donors should use to decide whether or not to give to the Fund.
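To make the tradeoff concrete, here is a toy illustration with purely hypothetical numbers (mine, not FP’s actual figures):

```python
# Toy comparison (hypothetical numbers): with a fixed researcher-time budget, a risk-neutral fund
# can either vet a few grants very thoroughly or evaluate more opportunities less exhaustively.
careful_portfolio = 3 * 8.0  # 3 deeply vetted grants, each ~8 units of expected impact, low variance
fast_portfolio = 8 * 7.0     # 8 lightly vetted grants, each ~7 units in expectation, higher variance
print(careful_portfolio, fast_portfolio)  # 24.0 vs 56.0: higher total EV despite more per-grant error
```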
Finally—a common (and IMO reasonable) criticism of EA-aligned or EA-adjacent organizations is an undue focus on quantification: “looking under the lamppost.” We want to avoid this without becoming detached from the base numeric truth, and one particular way we do so is by allowing difficult-to-quantify considerations to tilt us toward or away from a prospective grant. We do CEAs in nearly every case, but for the GHDF they serve an indicative purpose (as they often do at e.g. Open Phil) rather than a determinative one (as they often do at e.g. GiveWell). Non-quantitative considerations are elaborated and assessed in our internal recommendation template, which GWWC had access to but which I feel they somewhat underweighted in their analysis. These kinds of considerations find their way into our CEAs as well, particularly in the form of subjective inputs that GWWC, for their part, found unjustified.