I do think there are things worth funding for which evidence doesn’t exist. The initial RNA vaccine research relied on good judgement around a hypothetical, and had a hard time getting funding for lack of evidence. It ended up being critical to saving millions of lives.
I think there are more ways some sort of evidence can be included in grantmaking. But the core of the criticism is about judgement, and I think a $100k grant for six months of a video game developer’s time, or $50k grants to university student group organizers, represent poor judgement (EAIF and LTFF grants). These grants have caused reputational harm to the movement, and that should have been easy to foresee. What has been the hit to fundraising for EA global health and animal welfare causes from the fallout from bad longtermism bets (FTX/SBF included)?
On the rationalization: perhaps it isn’t a post-hoc rationalization so much as an excuse. It is saying “the funding bar was low, but we still think the expected value of the video game is more important than 25 lives”. That’s pretty crass. And probably worse than just the $100k counterfactual, because of the reputational spillover to other causes.
Presumably there’s some probability X of averting doom that you would consider more important than 25 statistical lives. I’d also guess that you’d agree this is true for some rather low-but-non-Pascalian probabilities. E.g., I predict that if you thought about the problem even briefly, you’d agree the above claim is true for X = 0.001%, not just, say, 30%.
(To be clear I’m definitely not saying that the grant’s effect size is >0.001% in expectation).
So then the real disagreement is either a) What X ought to be (where I presume you have a higher number than LTFF), or b) whether the game is above X.[1]
Stated more clearly, I think your disagreement with the grant is “merely” a practical disagreement about effect sizes, whereas your language here, if taken literally, is not actually sensitive to the effect size.
(My own guess is that the grant was not above the 2022 LTFF bar, but that’s an entirely different line of reasoning.) And of course, implicitly, I believe the 2022 LTFF bar was above the 2022 GiveWell bar by my lights.
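For concreteness, here’s a minimal sketch of the arithmetic behind that comparison. The ~$4,000-per-life figure for GiveWell top charities and the 8 billion lives notionally at stake in “doom” are illustrative assumptions I’m plugging in, not numbers from the grant writeup:

```python
# Rough sketch of the break-even comparison (all inputs are illustrative assumptions).
grant_usd = 100_000            # size of the video game grant
usd_per_life_givewell = 4_000  # assumed cost per life saved via GiveWell top charities
lives_at_stake = 8e9           # assumed lives affected if "doom" occurs

counterfactual_lives = grant_usd / usd_per_life_givewell  # = 25 lives
break_even_p = counterfactual_lives / lives_at_stake      # ~3e-9

print(f"Counterfactual lives via GiveWell: {counterfactual_lives:.0f}")
print(f"Break-even probability of averting doom: {break_even_p:.1e}")
```

Under these (made-up) inputs, any X above roughly three-in-a-billion beats 25 lives in raw EV terms, which is why I think the real disagreement is about whether the grant clears some non-Pascalian X, not about the multiplication itself.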
A butterfly flaps its wings and causes a devastating hurricane to form in the tropics. Therefore, we must exterminate butterflies, because there is some small probability X that doing so will avert hurricane disaster.
But it could just as easily be the case that the butterfly’s flaps prevent devastating hurricanes from forming. Therefore we must massively grow their population.
The point being, it can be practically impossible to understand the causal tree and get even the sign right around low-probability events.
That’s what I take issue with—it’s not just the numbers, it’s the structural uncertainty of cause-and-effect chains when you consider really low-probability events. Expected value is a pretty bad tool for action-relevant decision-making when you are dealing with such numerical and structural uncertainty. It’s perhaps better to pick a framework like “it’s robust under multiple decision theories” or “pick something that has the least downside risk”.
In our instance, two competing plausible structural theories among many are something like:
“game teaches someone an AI safety concept → makes them more knowledgeable or inspires them to take action → they work on AI safety → solve the alignment problem → future saved”
vs.
“people get interested in doing the most good → see a community of people that claims to do that, but that funds rich people to make video games → widespread distrust of the movement develops → a strong social stigma develops against people that care about AI risk → the range of people/worldviews greatly narrows because people don’t want to associate → makes it near-impossible to solve the alignment problem → future destroyed”
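To sketch why this matters (with numbers that are purely invented for illustration), the expected value you compute ends up dominated by an unmeasurable prior over which causal structure is the right one, not by anything you can observe:

```python
# Illustrative only: EV of the grant under two competing causal structures.
# Every number here is invented; the point is that the sign of the answer
# tracks the prior you place on the rosy structure, not any data.
p_positive_chain = 1e-6   # assumed P(game -> ... -> alignment solved)
p_negative_chain = 1e-6   # assumed P(game -> ... -> field stigmatized, future destroyed)
value_of_future = 1e9     # stand-in value of the long-term future (arbitrary units)

def expected_value(credence_in_rosy_structure):
    """EV of the grant given how much weight you put on the positive causal chain."""
    w = credence_in_rosy_structure
    return (w * p_positive_chain - (1 - w) * p_negative_chain) * value_of_future

for w in (0.4, 0.5, 0.6):
    print(f"credence in rosy structure = {w:.1f} -> EV = {expected_value(w):+.1f}")
```

The EV flips sign as the structural prior crosses 0.5, so the calculation is doing little more than restating whichever structure you assumed in the first place.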
The justifications for these grants tend to use some simple expected-value calculation of a single rosy hypothetical causal chain. The problem is that it’s possible to construct a hypothetical value chain to justify any sort of grant. So you have to do more than just make a rosy causal chain and multiply numbers through. I’ve commented before on some pretty bad ones that don’t pass the laugh test among domain experts in the climate and air quality space.
The key lesson from early EA (evidence-based giving in global health) was that it is really hard to understand whether the thing you are doing is having an impact, and what the valence of that impact is, even for short, measurable causal chains. EA’s popular causes now (longtermism) seem to jettison that lesson, when it is even more unclear what the impact and its sign are through complicated, low-probability causal chains.
So it’s about a lot more than effect sizes.
Worth noting that even GiveWell doesn’t rely on a single EV calculation either (however complex). Quoting Holden’s 10-year-old writeup Sequence thinking vs. cluster thinking:
Our approach to making such comparisons strikes some as highly counterintuitive, and noticeably different from that of other “prioritization” projects such as Copenhagen Consensus. Rather than focusing on a single metric that all “good accomplished” can be converted into (an approach that has obvious advantages when one’s goal is to maximize), we tend to rate options based on a variety of criteria using something somewhat closer to (while distinct from) a “1=poor, 5=excellent” scale, and prioritize options that score well on multiple criteria.
We often take approaches that effectively limit the weight carried by any one criterion, even though, in theory, strong enough performance on an important enough dimension ought to be able to offset any amount of weakness on other dimensions.
… I think the cost-effectiveness analysis we’ve done of top charities has probably added more value in terms of “causing us to reflect on our views, clarify our views and debate our views, thereby highlighting new key questions” than in terms of “marking some top charities as more cost-effective than others.”
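A minimal sketch of the kind of scoring that passage describes, with criteria and numbers invented purely for illustration: rate each option on several criteria, cap how much any single criterion can contribute, and favour options that do well across the board.

```python
# Illustrative "cluster thinking" style scoring: rate options 1-5 on several
# criteria, cap each criterion's contribution, and favour breadth over a
# single dominant dimension. Criteria names and scores are invented.
from statistics import mean

options = {
    "bednet distribution": {"evidence": 5, "cost_effectiveness": 4, "robustness": 5, "upside": 2},
    "AI safety video game": {"evidence": 1, "cost_effectiveness": 2, "robustness": 1, "upside": 5},
}

def cluster_score(scores, cap=4):
    """Average the criteria, capping each so one standout score can't dominate."""
    return mean(min(s, cap) for s in scores.values())

for name, scores in options.items():
    print(f"{name}: {cluster_score(scores):.2f}")
```

Unlike a single expected-value number, a breadth-of-criteria score like this limits how far strong performance on one dimension (e.g. speculative upside) can offset weakness everywhere else.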
I mean, there are pretty good theoretical reasons for thinking that anything that’s genuinely positive for longtermism has higher EV than anything that isn’t? Not really sure what’s gained by calling the view “crass”. (The wording may be crass, but you came up with the wording yourself!)
It sounds like you’re just opposed to strong longtermism. Which is fine, many people are. But then it’s weird to ask questions like, “Can’t we all agree that GiveWell is better than very speculative longtermist stuff?” Like, no, obviously strong longtermists are not going to agree with that! Read the paper if you really don’t understand why.
These grants have caused reputational harm to the movement, and that should have been easy to foresee. What has been the hit to fundraising for EA global health and animal welfare causes from the fallout from bad longtermism bets (FTX/SBF included)?
I really don’t think it’s fair to conflate speculative-but-inherently-innocent “bets” of this sort with SBF’s fraud. The latter sort of norm-breaking is positively threatening to others—an outright moral violation, as commonly understood. But the “reputational harm” of simply doing things that seem weird or insufficiently well-motivated to others seems very different to me, and probably not worth going to extremes to avoid (or else you can’t do anything that doesn’t sufficiently appeal to normies).
Perhaps another way to put it is that even longtermists have obvious reasons to oppose SBF’s fraud (my post that you linked to suggested that it was negative-EV for longtermist goals). But I think strong longtermists should generally feel perfectly comfortable defending speculative grants that are positive-EV and the only “risk” is that others don’t judge them so positively. People are allowed to make different judgments (as long as they don’t harm anyone). Let a thousand flowers bloom, and all that.
Insofar as your real message is, “Stop doing stuff that looks weird, even if it is perfectly defensible by longtermist lights, simply because I have neartermist values and disagree with it,” then that just doesn’t actually seem like a reasonable ask?
I think that longtermism relies on more popular, evidence-based causes like global health and animal welfare to do its reputational laundering through the EA label. I don’t see any benefit to global health and animal welfare causes from longtermism. And for that reason I think it would be better for the movement to split into “effective altruism” and “speculative altruism”, so the more robust global health and animal welfare cause areas don’t have to suffer the reputational risk and criticism that is almost entirely directed at the longtermism wing.
Given the movement is essentially driven by Open Philanthropy, and they aren’t going to split, I don’t see such a large movement split happening. So I may be inclined towards some version of, as you say, “Stop doing stuff that looks weird, even if it is perfectly defensible by longtermist lights, simply because I have neartermist values and disagree with it.” The longtermist stuff is maybe like 20% of funding and 80% of reputational risk, and the most important longtermist concerns can be handled without the really weird speculative stuff.
But that’s irrelevant, because I think this ought to be a pretty clear case of the grant not being defensible by longtermist standards. Paying Bay Area software development salaries to develop a video game (why not a cheap developer literally anywhere else?) that didn’t even get published is hardly defensible. I get that the whole purpose of the fund is to do “hits-based giving”. But it’s created an environment where nothing can be a mistake, because it is expected that most things will fail. And if nothing is a mistake, how can the fund learn from mistakes?
Ok, so it sounds like your comparisons with GiveWell were an irrelevant distraction, given that you understand the point of “hits-based giving”. Instead, your real question is: “why not [hire] a cheap developer literally anywhere else?”
I’m guessing the literal answer to that question is that no such cheaper developer applied for funding in the same round with an equivalent project. But we might expand upon your question: should a fund like LTFF, rather than just picking from among the proposals that come to it, try taking some of the ideas from those proposals and finding different (perhaps cheaper) PIs to develop them?
It’s possible that a more active role in developing promising longtermist projects would be a good use of their time. But I don’t find it entirely obvious the way that you seem to. A few thoughts that immediately spring to mind:
(i) My sense of that time period was that finding grantmakers was itself a major bottleneck, and given that longtermism seemed more talent-constrained than money-constrained at that time, having key people spend more time just to save some money presumably would not have seemed a wise tradeoff.
(ii) A software developer that comes to you with an idea presumably has a deeper understanding of it, and so could be expected to do a better job of it, than an external contractor to whom you have to communicate the idea. (That is, external contractors increase risk of project failure due to miscommunication or misunderstanding.)
(iii) Depending on the details, e.g. how specific the idea is, taking an idea from someone’s grant proposal to a cheaper PI might constitute intellectual theft. It certainly seems uncooperative / low-integrity, and not a good practice for grant-makers who want to encourage other high-skilled people with good ideas to apply to their fund!