I find this post really uncomfortable. I thought I’d mention that, even though I’m having a hard time putting my finger on why. I’ll give a few guesses as a reply to this comment so they can be discussed separately.
I personally feel queasy about telling people, in some detail, that they might die as an epistemic move, even when it’s true. I’m worried that this will move people to be less playful and imaginative in their thinking, and make worse intellectual, project, or career decisions overall, compared to more abstract/less concrete considerations of “how can I, a comfortable and privileged person with most of my needs already met, do the most good.”
(I’m not saying my considerations here are overwhelming, or even that well-thought-out).
I am inclined to agree, but to take the other side for a minute: if I imagine being very confident it were true, I feel like there would be important truths that I would only realize if I communicated that knowledge deep into my System 1.
I’m also confused about this. One framing I sometimes have is:

If I were a general in a war, or a scientist on the Manhattan Project, would I act similarly?
And my guess is that I wouldn’t act the same way. Like I’d care less about work-life balance, be more okay with life not going as well, be more committed, with x-risk stuff always at the top of my mind, etc. And framed that way, I feel more open to the change of S1 “getting it” being a positive one.
I’m worried that this will move people to be less playful and imaginative in their thinking, and make worse intellectual, project, or career decisions overall, compared to more abstract/less concrete considerations of “how can I, a comfortable and privileged person with most of my needs already met, do the most good.”

Interesting, can you say more? I have the opposite intuition, though this stems from the specific failure mode of considering AI Safety as weird, speculative, abstract, and only affecting the long-term future; I think this puts it at a significant disadvantage compared to more visceral and immediate forms of doing good, and that this kind of post can help partially counter that bias.
Hmm I want to say that I’m not at all certain what my all-things-considered position on this is.
But I think there are several potential cruxes that I’m framing the question around:
In general, is it good for “the project of doing good,” or AI safety in particular, for people to viscerally feel (rather than just intellectually manipulate) the problems?
Is the feeling that you personally will die (or otherwise have extremely direct stakes in the outcome) net positive or negative for thinking and action around AI safety?
What are the outreach/epistemics benefits and costs of thinking about AI killing everyone on a visceral level?
I have a bunch of thoughts on each of those, but they’re kinda jumbled and not very coherent at the moment.
One possible related framing is “what types of people/thinking styles/archetypes does the EA or AI safety community most need*?” Do we need:
Soldiers?
Rough definition: People who’re capable and willing to “do what needs to be done.” Can be pointed along a well-defined goal and execute well according to it.
In this world, most of the problems are well-known. We may not be strong enough to solve the problems, but we know what they are. It just takes grit, hardened determination, and occasionally local knowledge to solve the problems well.
What I like about this definition: Emphasizes grit. Emphasizes that often it just requires people willing to do the important grunt work, and to sacrifice prestige games to do the most important thing (control-F for “enormous amounts of shit”).
Potential failure modes: Intellectual stagnation. “Soldier mindset” in the bad sense of the term.
Philosophers?
Rough definition: Thinkers willing to play and toy around with lots of possibilities. Often bad at keeping their “eye on the ball,” and perhaps at general judgement, but good at spontaneous and creative intellectual jumps.
In this world, we may have dire problems, but the most important ones are poorly understood. We have a lot of uncertainty over what the problems even are, never mind how to solve them.
What I like about this definition: Emphasizes the uncertain nature of the problems, and the need for creativity in thought to fix the issues.
Potential failure modes: Focusing on interesting problems over important problems. Too much abstraction or “meta.”
Generals?
Rough definition: Thinkers/strategists willing to consider a large range of abstractions in the service of a (probably) just goal.
In this world, there are moral problems, and we have a responsibility to fix them. This can’t entirely be solved with pure grit; it requires careful judgements about risk, morale, logistics, ethics, etc. But there are still clear goals in mind, and keeping your “eye on the ball” is really important to achieving them. There’s also a lot of responsibility (if you fail, your people die. Worse, your side might lose the war and fascists/communists/whatever might take over).
What I like about this definition: Emphasizes a balance between thoughtfulness and determination.
Potential failure modes: Fighting the “wrong” war (most wars are probably bad). Prematurely abdicating responsibility for higher-level questions in favor of what’s needed to win.
Something else?
This is my current guess of what we need. “Generals” is an appealing aesthetic, but I think the problems aren’t well-defined enough, and our understanding of how to approach them is too ill-formed, for us to think of ourselves as generals in a moral war just yet.
In the above archetypes, I feel good about “visceralness” for soldiers and maybe generals, but not for philosophers. I think I feel bad about “contemplating your own death” for all three, but especially for philosophers and generals (a general who obsesses over their own death will probably make more mistakes because they aren’t trying as hard to win).
Perhaps I’m wrong. Another general-like archetype I’ve considered is “scientists on the Manhattan Project,” and I feel warmer about Manhattan Project scientists having a visceral sense of their own death than I do about generals. Perhaps I’d be interested in reading about actual scientists trying to solve problems that have a high probability of affecting them one day (e.g. aging, cancer, and heart disease researchers). Do they find the thought that their failures may be causally linked to their own death motivating, or just depressing?
*As a method to suss out both selection (who should we most try to attract?) and training (which virtues/mindsets is it most important to cultivate?).
Yeah, this resonates with me as well.
One reason I might be finding this post uncomfortable is that I’m pretty concerned about the mental health of many young EAs, and frankly, for some people I’ve met, I’m more worried about the chance of them dying from suicide or risky activities over the next decade than from x-risks. Unfortunately, I think there is also a link between being very focused on death by x-risk and poor mental health. This is an intuition, nothing more.
I share this concern, and it was my biggest hesitation about making this post. I’m open to the argument that the post was pretty net bad because of that.
If you’re finding things like existential dread concerning, I’ll flag that the numbers in this post are actually fairly low in the grand scheme of total risks to you over your life: 3.7% just isn’t that high. Dying young just isn’t that likely.
You know, even disregarding AI, I’d never have thought that I had a ~5% chance of dying in the next 30 years. It’s frightening.
I wouldn’t take this as bearing on the matter that you replied to in any way, though.
I’m sorry that the post made you uncomfortable, and appreciate you flagging this constructively. Responses in thread.
One reason I might be finding this post uncomfortable is the chart it’s centered around.
The medical information is based on real people who have died recently. It’s a forecast based on counting. We can have a lot of confidence in those numbers.
In contrast, the AI numbers are trying to predict something that’s never happened before. It’s worth trying to predict, but the numbers are very different in kind, and we can’t have much confidence in them, especially for one particular year.
It feels kind of misleading to try to put these two very different kinds of numbers side by side as if they’re directly comparable.
I fairly strongly disagree with this take on two counts:
The life expectancy numbers are not highly robust. They naively extrapolate the current rate of death in the UK out into the future (a toy sketch of this extrapolation is at the end of this comment). This is a pretty dodgy methodology! I’m assuming that medical technology won’t improve, that AI won’t accelerate biotech research, that longevity research doesn’t go anywhere, that we don’t have disasters like a much worse pandemic or nuclear war, that there won’t be new major public health hazards that disproportionately affect young people, that climate change won’t substantially affect life expectancy in the rich world, that there won’t be wars major enough to affect life expectancy in the UK, etc. The one thing we know won’t happen in the future is the status quo.
I agree that it’s less dodgy than the AI numbers, but the two are on a continuum; there isn’t some ontological difference between legit numbers and non-legit numbers.
Leaving that aside, I think it’s extremely reasonable to compare high confidence and low confidence numbers so long as they’re trying to measure the same thing. The key thing is that low confidence numbers aren’t low confidence in any particular direction (if they were, we’d change to a different estimate). Maybe the AI x-risk numbers are way higher, maybe they’re way lower. They’re definitely noisier, but the numbers mean fundamentally the same thing, and are directly comparable. And comparing numbers like this is part of the process of understanding the implications of your models of the future, even if they are fairly messy and uncertain models.
Of course, it’s totally reasonable to disagree with the models used for these questions and think that, e.g., they have major systematic biases towards exaggerating AI probabilities. That should just give you different numbers to put into this model.
As a concrete example, I’d like governments to be able to compare the risks of a nuclear war to their citizens lives, vs other more mundane risks, and to figure out cost-effectiveness accordingly. Nuclear wars have never happened in something remotely comparable to today’s geopolitical climate, so any models here will be inherently uncertain and speculative, but it seems pretty important to be able to answer questions like this regardless.
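(To make the “naive extrapolation” point above concrete, here is a minimal sketch of the kind of status-quo calculation being critiqued. The age-specific death rates below are illustrative placeholders, not actual UK life-table figures, and this is only the general shape of the method, not the post’s actual inputs.)

```python
# Toy version of the "naive extrapolation" described above: take today's
# age-specific annual death rates and assume they hold unchanged for decades.
# The hazard below is an illustrative placeholder, NOT real UK life-table data.

def illustrative_annual_death_rate(age: int) -> float:
    """Made-up hazard: ~0.05% per year at age 25, roughly doubling every 8 years."""
    return 0.0005 * 2 ** ((age - 25) / 8)

def p_die_within(years: int, start_age: int) -> float:
    """P(death within `years` of `start_age`), assuming current rates never change."""
    p_survive = 1.0
    for age in range(start_age, start_age + years):
        p_survive *= 1.0 - illustrative_annual_death_rate(age)
    return 1.0 - p_survive

print(f"{p_die_within(30, start_age=25):.1%}")  # roughly 6-7% with these made-up rates
# The exact output is not the point; the point is that the whole calculation
# silently assumes today's mortality rates persist for the next 30 years.
```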
Re 2: the right way to compare high- and low-confidence numbers is to add error bounds. This chart does not do that.
I disagree because I think error bounds over probabilities are less principled than a lot of people assume, and they can add a bunch of false confidence.
More false confidence than not mentioning error ranges at all?
Yes. Quantitative expression of credal resilience is complicated, there isn’t a widely agreed-upon formulation, and a lot of people falsely assume that error bounds on made-up probabilities are more “rigorous” or “objective” than the probabilities themselves.
The issue is that by putting together (relatively) high-confidence and low-confidence estimates in your calculation, your resulting numbers should be low-confidence. For example, if your error bounds for AI risk vary by an order of magnitude each way (which is frankly insanely small for something this speculative), then the error bounds in your relative risk estimate would give you a value between 0.6% and 87%. With an error range like this, I don’t think the statement “my most likely reason to die young is AI x-risk” is justified.
Hmm. I agree that these numbers are low confidence. But for the purpose of acting and forming conclusions from this, I’m not sure what you think is a better approach (beyond saying that more resources should be put into becoming more confident, which I broadly agree with).
Do you think I can never make statements like “low confidence proposition X is more likely than high confidence proposition Y”? What would feel like a reasonable criterion for being able to say that kind of thing?
More generally, I’m not actually sure what you’re trying to capture with error bounds—what does it actually mean to say that P(AI X-risk) is in [0.5%, 50%] rather than 5%? What is this a probability distribution over? I’m estimating a probability, not a quantity. I’d be open to the argument that the uncertainty comes from ‘what might I think if I thought about this for much longer’.
I’ll also note that the timeline numbers are a distribution over years, which already implicitly includes a bunch of uncertainty, plus some probability of AI never arriving. Though obviously it could include more. The figure for AI x-risk is a point estimate, which is much dodgier (see the toy sketch at the end of this comment).
And I’ll note again that the natural causes numbers are at best medium confidence, since they assume the status quo continues!
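(As a toy illustration of the distinction drawn above, between a distribution over AI-arrival years and a single point estimate of risk conditional on AI, here is a minimal sketch. Every number in it is hypothetical, not the post’s actual inputs.)

```python
# Toy illustration: a timeline *distribution* combined with a *point estimate*
# of risk conditional on advanced AI. All numbers below are hypothetical.

# Hypothetical P(transformative AI arrives within each decade); the probabilities
# sum to less than 1, leaving mass on "not this century / never".
p_agi_in_decade = {2030: 0.15, 2040: 0.20, 2050: 0.20, 2060: 0.15}

P_DOOM_GIVEN_AGI = 0.10  # hypothetical point estimate; the "much dodgier" input

def p_ai_death_by(year: int) -> float:
    """P(dying from AI by `year`) = P(AI arrives by then) * P(doom | AI)."""
    p_agi_by_year = sum(p for decade, p in p_agi_in_decade.items() if decade <= year)
    return p_agi_by_year * P_DOOM_GIVEN_AGI

print(f"{p_ai_death_by(2050):.1%}")  # 5.5% with these made-up inputs
```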
would give you a value between 0.6% and 87%

Nitpick: I think you mean ~6%? (0.37/(0.37+5.3) ≈ 0.065.) Obviously this doesn’t change your core point.
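(For reference, a minimal sketch of the arithmetic behind the quoted range, using the 3.7% AI figure and the 5.3% other-causes figure that appear in this thread, with the order-of-magnitude bounds applied to the AI number only:)

```python
# Reproducing the arithmetic behind the quoted range: the share of "dying young"
# risk attributable to AI, if the AI estimate were off by 10x in either direction.
# 3.7% (AI) and 5.3% (other causes) are the figures already used in this thread.

P_OTHER = 5.3     # % chance of dying young from other causes
P_AI_POINT = 3.7  # % point estimate for AI x-risk over the same period

def ai_share(p_ai: float) -> float:
    """Fraction of 'dying young' outcomes attributable to AI."""
    return p_ai / (p_ai + P_OTHER)

for p_ai in (P_AI_POINT / 10, P_AI_POINT, P_AI_POINT * 10):
    print(f"P(AI) = {p_ai:5.2f}%  ->  AI share = {ai_share(p_ai):.1%}")
# Prints ~6.5%, ~41.1%, ~87.5%: the lower bound is ~6% (not 0.6%), the upper ~87%.
```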
Do you think I can never make statements like “low confidence proposition X is more likely than high confidence proposition Y”? What would feel like a reasonable criterion for being able to say that kind of thing?

Honestly, yeah, I think it is weird to definitively state that X, a wildly speculative thing, is more likely than Y, a well-known and well-studied thing (or, to put it differently, to do so when the error bounds on X are orders of magnitude wider than the error bounds on Y). It might help if you provided a counterexample here? I think my objections might be partially about semantics: saying “X is more likely than Y” seems like smuggling certainty into a very uncertain proposition.
what does it actually mean to say that P(AI X-risk) is in [0.5%, 50%] rather than 5%

I think it more accurately conveys the state of knowledge about the situation, which is that you don’t know much at all.
(also, lol, fair point on the calculation error)
I don’t disagree with the premise that agreeing on empirical beliefs about AI probably matters more for whether someone does AI safety work than agreeing on philosophical beliefs does. I’ve made that argument before!