I’m very excited to read this sequence! I have a few thoughts. Not sure how valid or insightful they are but thought I’d put them out there:
On going beyond EVM / risk neutrality
The motivation for investigating alternatives to EVM seems to be that EVM has some counterintuitive implications. I’m interested in the meta question of how much we should be swayed by counterintuitive conclusions when EVM seems to be so well-motivated (e.g. VNM theorem), and the fact that we know we are prone to biases and cognitive difficulties with large numbers.
Would alternatives to EVM also have counterintuitive conclusions? How counterintuitive?
The motivation for incorporating risk aversion also seems driven by our intuition, but it’s worth remembering the problems with rejecting risk neutrality, e.g. some ways of being risk averse sometimes mean choosing an action that is stochastically dominated. Again, what are the problems with the alternatives and how serious are they?
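To make the stochastic dominance worry concrete, here’s a toy example (the gambles and numbers are entirely made up by me): plain expected value prefers the gamble that stochastically dominates, while a naive risk-averse rule that down-weights each outcome’s probability separately ends up preferring the dominated one.

```python
# Toy illustration with made-up numbers: gambles as {outcome_value: probability}.
# B first-order stochastically dominates A: it is at least as likely as A to
# reach every level of value, and strictly more likely to reach 101.
A = {100: 0.5, 0: 0.5}
B = {101: 0.25, 100: 0.25, 0: 0.5}

def expected_value(gamble):
    return sum(value * prob for value, prob in gamble.items())

def naive_risk_weighted_value(gamble, weight=lambda p: p ** 2):
    # "Risk averse" by pessimistically down-weighting each outcome's probability
    # on its own. (Rank-dependent theories such as Buchak-style REU weight
    # cumulative probabilities instead, partly to avoid exactly this failure.)
    return sum(value * weight(prob) for value, prob in gamble.items())

print(expected_value(A), expected_value(B))                        # 50.0 50.25 -> prefers dominating B
print(naive_risk_weighted_value(A), naive_risk_weighted_value(B))  # 25.0 12.5625 -> prefers dominated A
```

The point isn’t that every risk-averse theory behaves like this, only that some natural-seeming ways of building in risk aversion do, and that’s the kind of cost I’d want weighed against EVM’s counterintuitive implications.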
On choosing other cause areas over reducing x-risk
As it stands I struggle to justify GHD work at all on cluelessness grounds. GiveWell-type analyses ignore a lot of foreseeable indirect effects of the interventions e.g. those on non-human animals. It isn’t clear to me that GHD work is net positive. I’d be interested in further work on this important point given how much money is given to GHD interventions in the community.
Not all x-risk is the same. Are there specific classes of x-risk that are pretty robust to the issues people have raised about ‘x-risk as a whole’? For example, might s-risks—those that deal with outcomes far worse than extinction—be pretty robust? Are certain interventions, such as expanding humanity’s moral circle or boosting economic growth, robustly positive and better than alternatives such as GHD even if the Time of Perils hypothesis isn’t true? I’m genuinely not sure, but I know I don’t feel comfortable lumping all x-risks and all x-risk interventions together in one bucket.
As it stands I struggle to justify GHD work at all on cluelessness grounds. GiveWell-type analyses ignore a lot of foreseeable indirect effects of the interventions e.g. those on non-human animals.
I support most of this comment, but strongly disagree with this, or at least think it’s much too strong. Cluelessness isn’t a categorical property which some interventions have and some don’t—it’s a question of how much to moderate your confidence in a given decision. Far from being the unanswerable question Greaves suggests, it seems reasonable to me to do any or all of the following:
Assume unknown unknowns pan out to net 0
Give credences on a range of known unknowns
Time-limit the above process in some way, and give an overall best guess expectation for remaining semi-unknowns
Act based on the numbers you have from the above process when you stop (a rough sketch of what this could look like follows this list)
Incorporate some form of randomness in the criteria you investigate
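To make the first four steps concrete, here’s a minimal sketch (the effect names, credences, and values are purely illustrative, not estimates of any real intervention):

```python
# Minimal sketch of the process above, with made-up credences and values.
# Each "known unknown" gets a credence that the effect is real and a guess at
# its value if real; unknown unknowns are assumed to wash out to zero.
known_unknowns = {
    "direct benefit to recipients": (0.95, +100),
    "indirect effect on non-human animals": (0.60, -40),
    "long-run economic ripple effects": (0.30, +20),
}
unknown_unknowns = 0  # step 1: assume these net out to zero

# Steps 2-4: combine the credence-weighted guesses and act on the total once
# your time budget for investigating runs out.
best_guess = unknown_unknowns + sum(p * v for p, v in known_unknowns.values())
print(f"Best-guess expected value: {best_guess:+.1f}")  # +77.0
```

The hard part is obviously settling on the credences, but the claim is just that the procedure is workable, not that these particular numbers mean anything.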
If you’re not willing to do something like the above, you lose the ability to predict anything, including the effects of supposedly longtermist interventions, which are all mired in their own uncertainties.
So while one might come to the view that GHD is in fact bad because of e.g. the poor meat-eater problem, it seems irrational to be agnostic on the question, unless you’re comparably agnostic towards every other cause.
I think you might be right that Greaves is too strong on this, and I’ll admit I’m still quite uncertain about exactly how cluelessness cashes out. However, I know I have difficulties funding GHD work (and would even if there were nothing else to fund), but that I don’t have similar difficulties for certain longtermist interventions. I’ll try to explain.
I don’t want to fund GHD work because it’s just very plausible that the animal suffering outweighs the human benefit. Some have called for development economists to consider the welfare of non-human animals. Despite this, GiveWell hasn’t yet done so (I’m not criticising—I know it’s tricky). I think it’s possible for a detailed analysis to make me happy to give to GHD interventions (over burning the money), but we aren’t there yet.
On the other hand, there are certain interventions that either:
Have no plausible foreseeable downsides, or
For which I am pretty confident the upsides outweigh the downsides in expectation.
For example:
Technical AI alignment research / AI governance and coordination research: I struggle to come up with a story for why this might be bad. Maybe the story is that it would slow down progress and delay the benefits from AI, but Bostrom’s astronomical waste argument, combined with genuine concerns about AI safety from experts, debunks this story for me. I am left feeling pretty confident that funding technical AI alignment research is net positive in expectation.
Expanding our moral circle: Again, I just struggle to come up with a story for why this might be bad. Of course, poor advocacy can be counterproductive, which means we should be careful and not overly dogmatic or annoying. Spreading the ideas and advocating in a careful, thoughtful manner just seems robustly positive to me. The upsides seem very large given the risk of lock-in of negative outcomes for non-humans (e.g. non-human animals or digital minds).
Some other interventions I don’t think I’m clueless about include:
Global priorities research
Growing EA/longtermism (I could be convinced it’s bad to try to grow EA given recent negative publicity)
Investing for the future
Research into consciousness
Research into improving mental health
I have no particular reason to think you shouldn’t believe any of those claims, but fwiw I find it quite plausible (though I wouldn’t care to give particular credences atm) that at least some of them could be bad, e.g.:
Technical AI safety seems to have been the impetus for various organisations that are working on AI capabilities in a way that everyone except them seems to think is net negative (OpenAI, DeepMind, Anthropic, maybe others). Also, if humans end up successfully limiting AI according to our own preferences, that could end up being a moral catastrophe all of its own.
‘Expanding our moral circle’ sounds nice, but without a clear definition of the morality involved it’s pretty vague what it means—and with such a definition, it could cash out as ‘make people believe our moral views’, which doesn’t have a great history.
Investing for the future could put a great deal of undemocratic power into the hands of a small group of people whose values could shift (or turn out to be ‘wrong’) over time.
And all of these interventions just cost a lot of money, something which the EA movement seems very short on recently.
I don’t buy the argument that AI safety is in some way responsible for dangerous AI capabilities. Even if the concept of AI safety had never been raised, I’m pretty sure we would still have had AI orgs pop up.
Also, yes, it is possible that working on AI safety could limit AI and be a catastrophe in terms of lost welfare, but I still think AI safety work is net positive in expectation given Bostrom’s astronomical waste argument and genuine concerns about AI risk from experts.
The key point here is that cluelessness doesn’t arise just because we can think of ways an intervention could be both good and bad—it arises when we really struggle to weigh these competing effects. In the case of AI safety, I don’t struggle to weigh them.
Expanding the moral circle, for me, would mean expanding it to anything that is sentient or has the capacity for welfare.
As for investing for the future, you can probably mitigate those risks. Again, though, my point stands that, even if that is a legitimate worry, I can try to weigh that risk against the benefit. I personally feel fine in determining that, overall, investing funds for future use that are ‘promised for altruistic purposes’ seems net positive in expectation. We can debate that point of course, but that’s my assessment.
I think at this point we can amicably disagree, though I’m curious why you think the ‘more people = more animals exploited’ philosophy applies to people in Africa, but not in the future. One might hope that we learn to do better, but it seems like that hope could be applied to and criticised in either scenario.
I do worry about future animal suffering. It’s partly for that reason that I’m less concerned about reducing risks of extinction than I am about reducing other existential risks that would result in large amounts of suffering in the future. This informed some of my choices of interventions that I am ‘not clueless’ about. E.g.:
Technical AI alignment / AI governance and coordination research: it has been suggested that misaligned AI could be a significant s-risk.
Expanding our moral circle: relevance to future suffering should be obvious.
Global priorities research: this just seems robustly good; how could increasing moral understanding be bad?
Research into consciousness: seems really important in light of the potential risk of future digital minds suffering.
Research into improving mental health: improving mental health has intrinsic worth and I don’t see a clear link to increasing future suffering (in fact I lean towards thinking happier people/societies are less likely to act in morally outrageous ways).
I do lean towards thinking reducing extinction risk is net positive in expectation too, but I am quite uncertain about this and I don’t let it motivate my personal altruistic choices.
Thanks for engaging, Jack! As you’d expect, we can’t tackle everything in a single sequence, so you won’t get our answers to all your questions here. We say a bit more about the philosophical issues associated with going beyond EVM in this supplementary document, but since our main goal is to explore the implications of alternatives to EVM, we’re largely content to motivate those alternatives without arguing for them at length.
Re: GHD work and cluelessness, I hear the worry. We’d like to think about this more ourselves. Here’s hoping we’re able to do some work on it in the future.
Re: not all x-risk being the same, fair point. We largely focus on extinction risk and do try to flag as much in each report.
Thanks for your response, I’m excited to see your sequence. I understand you can’t cover everything of interest, but maybe my comments give ideas as to where you could do some further work.
As it stands I struggle to justify GHD work at all on cluelessness grounds. GiveWell-type analyses ignore a lot of foreseeable indirect effects of the interventions e.g. those on non-human animals. It isn’t clear to me that GHD work is net positive.
Would you mind expanding a bit on why this applies to GHD and not other cause areas please? E.g.: wouldn’t your concerns about animal welfare from GHD work also apply to x-risk work?
I’ll direct you to my response to Arepo.
I’m interested in the meta question of how much we should be swayed by counterintuitive conclusions when EVM seems to be so well-motivated (e.g. VNM theorem), and the fact that we know we are prone to biases and cognitive difficulties with large numbers.
I’ve been interested in this as well, and I found Holden’s counterarguments in Sequence thinking vs. cluster thinking persuasive in changing my mind about decision guidance in practice (e.g. at the “implementation level” of personally donating to actual x-risk mitigation funds). I’d be curious to know if you have a different reaction.
Edited to add: I just realized who I’m replying to, so I wanted to let you know that your guided cause prio flowchart was a key input at the start of my own mid-career pivot, and I’ve been sharing it from time to time with other people. In that post you wrote:
my (ambitious) vision would be for such a flowchart to be used widely by new EAs to help them make an informed decision on cause area, ultimately improving the allocation of EAs to cause areas.
and if I’m interpreting your followup comment correctly, it’s sad to see little interest in such a good first-cut distillation. Here’s hoping interest picks up going forward.
I’ll need to find some time to read that Holden post.
I’m happy that the flowchart was useful to you! I might consider working on it in the future, but I think the issues are that I’m not convinced many people would use it and that the actual content of the flowchart might be pretty contentious—so it would be easy to be accused of being biased. I was using my karma score as a signal of whether I should continue with it, and the karma wasn’t impressive.