Hmm, I should note that I am in strong support of quantitative models as a tool for aiding decision-making—I am only against committing ahead of time to do whatever the model tells you to do. If the post is against the use of quantitative models in general, then I do in fact disagree with the post.
Some things that feel like quantitative models that are merely “aiding” rather than “doing” decision-making:
Dear MichaelStJules and rohinmshah,
Thank you very much for all of these thoughts. They are very interesting, and I will have to read all of these links when I have the time.
I admit that my view that the EA community relies heavily on EV calculations was somewhat based on vague experience, without a full assessment of the level of reliance (which would have been ideal), so the posted examples are very useful.
*
To clarify one point:
If the post is against the use of quantitative models in general, then I do in fact disagree with the post.
I was not at all against quantitative models. Most of the DMDU stuff is quantitative models. I was arguing against the overuse of quantitative models of a particular type.
*
To answer one question:
would you have been confident that the conclusion would have agreed with our prior beliefs before the report was done?
Yes. I would have been happy to say that, in general, I expect work of this type is less likely to be useful than other research work that does not try to predict the long-run future of humanity. (This is in a general sense, not considering factors like the researchers’ background, skills, and so forth.)
Yes. I would have been happy to say that, in general, I expect work of this type is less likely to be useful than other research work that does not try to predict the long-run future of humanity.
Sorry, I think I wasn’t clear. Let me make the case for the ex ante value of the Open Phil report in more detail:
1. Ex ante, it was plausible that the report would have concluded “we should not expect lots of growth in the near future”.
2. If the report had this conclusion, then we should update that AI risk is much less important than we currently think. (I am not arguing that “lots of growth ⇒ transformative AI”, I am arguing that “not much growth ⇒ no transformative AI”.)
3. This would be a very significant and important update (especially for Open Phil). It would presumably lead them to put less money into AI and more money into other areas.
4. Therefore, the report was ex ante quite valuable since it had a non-trivial chance of leading to major changes in cause prioritization.
Presumably you disagree with 1, 2, 3 or 4; I’m not sure which one.
Some things that feel like quantitative models that are merely “aiding” rather than “doing” decision-making:
Are there any particular articles/texts you would recommend?
Imo, the Greaves and MacAskill paper relies primarily on explicit calculations and speculative plausibility arguments for its positive case for strong longtermism. Of course, the paper might fit within a wider context, and there isn’t enough space to get into the details for each of the proposed interventions.
My impression is that relying on a mixture of explicit quantitative models and speculative arguments is a problem in EA generally, not unique to longtermism. Animal Charity Evaluators has been criticized a few times for this, see here and here. I’m still personally not convinced the Good Food Institute has much impact at all, since I’m not aware of a proper evaluation that didn’t depend a lot on speculation (I think this related analysis is more rigorous and justified). GiveWell has even been criticized for relying too much on quantitative models in practice, too, despite Holden’s own stated concerns with this.
Are there any particular articles/texts you would recommend?
Sorry, on what topic?
Imo, the Greaves and MacAskill paper relies primarily on explicit calculations and speculative plausibility arguments for its positive case for strong longtermism.
I see the core case of the paper as this:
… putting together the assumption that the expected size of the future is vast and the assumption that all consequences matter equally, it becomes at least plausible that the amount of ex ante good we can generate by influencing the expected course of the very long-run future exceeds the amount of ex ante good we can generate via influencing the expected course of short-run events, even after taking into account the greater uncertainty of further-future events.
They do illustrate claims like “the expected size of the future is vast” with calculations, but those are clearly illustrative; the argument is just “there’s a decent chance that humanity continues for a long time with similar or higher population levels”. I don’t think you can claim that this relies on explicit calculations except inasmuch as any reasoning that involves claims about things being “large” or “small” depends on calculations.
I also don’t see how this argument is speculative: it seems really hard to me to argue that any of the assumptions or inferences are false.
Note it is explicitly talking about the expected size of the future, and so is taking as a normative assumption that you want to maximize actual expected values. I suppose you could argue that the argument is “speculative” in that it depends on this normative assumption, but in the same way AMF is “speculative” in that it depends on the normative assumption that saving human lives is good (an assumption that may not be shared by e.g. anti-natalists or negative utilitarians).
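For concreteness, here is a minimal Python sketch of the structure of that expected-value claim. The numbers are entirely my own illustrative assumptions, not figures from the paper or from anyone in this thread; the point is only that when the expected size of the future is vast, even heavily discounted influence on it can dominate in expectation.

```python
# Back-of-the-envelope structure of the "vast expected future" argument.
# Every number below is a made-up placeholder, purely for illustration.

p_vast_future = 0.01              # chance humanity persists at scale for a very long time
lives_if_vast = 1e16              # lives lived in that scenario
p_intervention_matters = 1e-6     # chance a longtermist intervention improves that future
fraction_affected = 1e-3          # fraction of those lives improved if it does

ev_longrun_influence = p_vast_future * lives_if_vast * p_intervention_matters * fraction_affected
ev_shortrun_benchmark = 1_000     # lives helped by a typical near-term intervention

print(f"Illustrative long-run EV:  {ev_longrun_influence:,.0f} lives")
print(f"Illustrative short-run EV: {ev_shortrun_benchmark:,.0f} lives")
```

Swapping in different placeholder values can flip the comparison, which is why the inputs themselves carry most of the weight.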
Animal Charity Evaluators has been criticized a few times for this, see here and here.
I haven’t been following animal advocacy recently, but I remember reading “The Actual Number is Almost Surely Higher” when it was released and feeling pretty meh about it. (I’m not going to read it now, it’s too much of a time sink.)
GiveWell has even been criticized for relying too much on quantitative models in practice, too, despite Holden’s own stated concerns with this.
Yeah I also didn’t agree with this post. The optimizer’s curse tells you that you should expect your estimates to be inflated, but it does not change the actual decisions you should make. I agree somewhat more with the wrong-way reductions part, but I feel like that says “don’t treat your models as objective fact”; GiveWell frequently talks about how the cost-effectiveness model is only one input into their decision making.
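As a toy illustration of that reading of the optimizer’s curse (this is my own minimal simulation, not GiveWell’s model): when you pick the option with the highest noisy estimate, the estimate attached to your pick is systematically inflated, even though, with equal noise across options, ranking by estimate is still how you would choose.

```python
import random

# Toy optimizer's curse: several options with identical true value, each measured
# with independent noise; we always select the option with the highest estimate.
random.seed(0)

TRUE_VALUE = 10.0    # assume every option is actually equally good
NOISE_SD = 3.0
N_OPTIONS = 5
N_TRIALS = 10_000

chosen_estimates = []
for _ in range(N_TRIALS):
    estimates = [random.gauss(TRUE_VALUE, NOISE_SD) for _ in range(N_OPTIONS)]
    chosen_estimates.append(max(estimates))  # the value we *think* our chosen option has

print(f"True value of every option:         {TRUE_VALUE:.2f}")
print(f"Average estimate of the chosen one: {sum(chosen_estimates) / N_TRIALS:.2f}")
```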
More generally, I don’t think you should look at the prevalence of critiques as an indicator for how bad a thing is. Anything sufficiently important will eventually be critiqued. The question is how correct or valid those critiques are.
I’m still personally not convinced the Good Food Institute has much impact at all, since I’m not aware of a proper evaluation that didn’t depend a lot on speculation
I’m interpreting this as “I don’t have >90% confidence that GFI has actually had non-trivial impact so far (i.e. an ex-post evaluation)”. I don’t have a strong view myself since I haven’t been following GFI, but I expect even if I read a lot about GFI I’d agree with that statement.
However, if you think this should be society’s bar for investing millions of dollars, you would also have to be against many startups, nearly all VCs and angel funding, the vast majority of scientific R&D, some government megaprojects, etc. This bar seems clearly too stringent to me. You need some way of doing something like hits-based funding.
To make a strong case for strong longtermism or a particular longtermist intervention, without relying too much on quantitative models and speculation.
I see the core case of the paper as this:
(...)
I also don’t see how this argument is speculative: it seems really hard to me to argue that any of the assumptions or inferences are false.
I don’t disagree with the claim that strong longtermism is plausible, but shorttermism is also plausible. The case for strong longtermism rests on actually identifying robustly positive interventions aimed at the far future, and making a strong argument that they are indeed robustly positive (and much better than shorttermist alternatives). One way of operationalizing “robustly positive” is that I may have multiple judgements of EV for different plausible worldviews, and each should be positive (although this is a high bar). I think their defences of particular longtermist interventions are speculative (including patient philanthropy), but expecting more might be unreasonable for a paper of that length which isn’t focused on any particular intervention.
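Here is a minimal sketch of how I picture that operationalization; the worldviews and EV numbers are placeholders I made up, just to show the difference between being positive on average and being robustly positive.

```python
# "Robustly positive" as a bar: the intervention must have positive EV under every
# plausible worldview, not merely a positive average. All numbers are hypothetical.

ev_by_worldview = {
    "total utilitarian, low extinction risk": 50.0,
    "total utilitarian, high extinction risk": 200.0,
    "suffering-focused": -30.0,
    "person-affecting": 5.0,
}

positive_on_average = sum(ev_by_worldview.values()) / len(ev_by_worldview) > 0
robustly_positive = all(ev > 0 for ev in ev_by_worldview.values())

print(f"Positive on a simple average across worldviews: {positive_on_average}")    # True
print(f"Robustly positive (positive under every worldview): {robustly_positive}")  # False
```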
I’m interpreting this as “I don’t have >90% confidence that GFI has actually had non-trivial impact so far (i.e. an ex-post evaluation)”.
Yes, and I’m also not willing to commit to any specific degree of confidence, since I haven’t seen any in particular justified. This also applies to future impact. Why shouldn’t my prior for success be < 1%? Can I rule out a negative expected impact?
However, if you think this should be society’s bar for investing millions of dollars, you would also have to be against many startups, nearly all VCs and angel funding, the vast majority of scientific R&D, some government megaprojects, etc. This bar seems clearly too stringent to me. You need some way of doing something like hits-based funding.
I think in many of these cases we could develop some reasonable probability distributions to inform us (and when multiple priors are reasonable for many interventions and we have deep uncertainty, diversification might help). FHI has done some related work on the cost-effectiveness of research. It could turn out that the successes don’t (or ex ante won’t) justify the failures in a particular domain. Hits-based funding shouldn’t be taken for granted.
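A minimal sketch of what “developing a reasonable probability distribution” for a hits-based bet could amount to; every parameter below is hypothetical, and the point is that the conclusion hangs on the prior probability of a hit:

```python
# Toy hits-based funding comparison: a long-shot grant versus a well-evidenced one.
# All values are hypothetical and in arbitrary impact units.

p_hit = 0.02              # prior probability the long shot succeeds
value_if_hit = 10_000.0   # impact if it succeeds
value_if_miss = -10.0     # small downside otherwise (e.g. displaced effort)
ev_reliable = 150.0       # EV of the well-evidenced alternative

ev_long_shot = p_hit * value_if_hit + (1 - p_hit) * value_if_miss
# Probability of a hit at which the two options have equal EV:
break_even_p = (ev_reliable - value_if_miss) / (value_if_hit - value_if_miss)

print(f"EV of the long shot:           {ev_long_shot:.1f}")
print(f"EV of the reliable option:     {ev_reliable:.1f}")
print(f"Break-even probability of hit: {break_even_p:.3f}")  # ~0.016 here
```

With a prior below the break-even value (say 1%), the long shot loses the comparison, which is why the prior itself needs justification rather than being assumed.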
I feel like it’s misleading to take a paper that explicitly says “we show that strong longtermism is plausible” and does so via robust arguments, and to conclude from it that longtermist EAs are basing their conclusions on speculative arguments.
If you want robust arguments for interventions you should look at those interventions. I believe there are robust arguments for work on e.g. AI risk, such as Human Compatible. (Personally, I prefer a different argument, but I think the one in HC is pretty robust and only depends on the assumption that we will build intelligent AI systems in the near-ish future, say by 2100.)
Yes, and I’m also not willing to commit to any specific degree of confidence, since I haven’t seen any in particular justified. This also applies to future impact. Why shouldn’t my prior for success be < 1%? Can I rule out a negative expected impact?
Idk what’s happening with GFI, so I’m going to bow out of this discussion. (Though one obvious hypothesis is that GFI’s main funders have more information than you do.)
Hits-based funding shouldn’t be taken for granted.
I mean, of course, but it’s not like people just throw money randomly in the air. They use the sorts of arguments you’re complaining about to figure out where to try for a hit. What should they do instead? Can you show examples of that working for startups, VC funding, scientific R&D, etc? You mention two things:
Developing reasonable probability distributions
Diversification
It seems to me that longtermists are very obviously trying to do both of these things. (Also, the first one seems like the use of “explicit calculations” that you seem to be against.)
If you want robust arguments for interventions you should look at those interventions. I believe there are robust arguments for work on e.g. AI risk, such as Human Compatible.
Thank you!
I feel like it’s misleading to take a paper that explicitly says “we show that strong longtermism is plausible” and does so via robust arguments, and to conclude from it that longtermist EAs are basing their conclusions on speculative arguments.
I’m not concluding from that paper alone that longtermist EAs are in general basing their conclusions on speculative arguments, although this is my impression from a lot of what I’ve seen so far, which is admittedly not much. I’m not that familiar with the specific arguments longtermists have made, which is why I asked you for recommendations.
I think showing that longtermism is plausible is also an understatement of the goal of the paper, since it only really describes section 2, and the rest of the paper aims to strengthen the argument and address objections. My main concerns are with section 3, where they argue specific interventions are actually better than a given shorttermist one. They consider objections to each of those and propose the next intervention to get past them. However, they end with the meta-option in 3.5 and speculation:
It would also need to be the case that one should be virtually certain that there will be no such actions in the future, and that there is no hope of discovering any such actions through further research. This constellation of conditions seems highly unlikely.
I think this is a Pascalian argument: we should assign some probability to eventually identifying robustly positive longtermist interventions that is large enough to make the argument go through. How large and why?
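To make the “how large” question concrete, here is a small sketch (my own framing, with placeholder numbers) of the threshold probability the meta-option needs, and how it shrinks as the assumed payoff grows; that shrinking threshold is what gives the argument its Pascalian flavour.

```python
# What probability of "we eventually identify a robustly positive longtermist
# intervention" does the meta-option need in order to match a shorttermist
# baseline in expected value? All numbers are placeholders.

baseline_value = 1.0  # EV of the shorttermist alternative (normalised to 1)

for payoff_multiple in [10, 1_000, 100_000, 10_000_000]:
    required_p = baseline_value / payoff_multiple  # minimum P(success) to break even
    print(f"Payoff {payoff_multiple:>10,}x baseline -> needs P(success) >= {required_p:.7f}")
```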
It seems to me that longtermists are very obviously trying to do both of these things. (Also, the first one seems like the use of “explicit calculations” that you seem to be against.)
I endorse the use of explicit calculations. I don’t think we should depend on a single EV calculation (including by taking weighted averages of models or other EV calculations; sensitivity analysis is preferable). I’m interested in other quantitative approaches to decision-making as discussed in the OP.
My major reservations about strong longtermism include:
I think the longer (causally or temporally) the causal chains we construct, the more fragile they are and the more likely they are to miss other important effects, including effects that may go in the opposite direction (see the sketch after this list). Feedback closer to our target outcomes and to what we value terminally reduces this issue.
I think human extinction specifically could be a good thing (due to s-risks or otherwise spreading suffering through space) so interventions that would non-negligibly reduce extinction risk are not robustly good to me (not necessarily robustly negative, either, though). Of course, there are other longtermist interventions.
I am by default skeptical of the strength of causal effects without evidence, and I haven’t seen good evidence for the major claims of causation I’ve come across, but I have also only started looking, and pretty passively.
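A minimal sketch of the first reservation in the list above; the per-link probability is arbitrary, but it shows how quickly confidence in an end-to-end effect decays if each constructed causal link only holds with some probability.

```python
# Toy model of causal-chain fragility: assume each link in a constructed causal
# chain independently holds with probability p; the chance that the whole chain
# holds shrinks geometrically with its length. The probability is arbitrary.

p_per_link = 0.8

for n_links in [1, 3, 5, 10]:
    p_chain = p_per_link ** n_links
    print(f"{n_links:>2} links at p={p_per_link}: P(entire chain holds) = {p_chain:.3f}")
```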
Yeah, that’s a fair point, sorry for the bad argument.