I think I agree with this general approach to thinking about this.
From what I've seen of AI risk discussions, I think I'd stand by my prior statement, which I'd paraphrase now as: There are a variety of different types of AI catastrophe scenario that have been discussed. Some seem like they might be more likely or similarly likely to totally wipe us out than to cause a 5-25% death toll. But some don't. And I haven't seen super strong arguments for considering the former much more likely than the latter. And it seems like the AI safety community as a whole has become more diverse in their thinking on this sort of thing over the last few years.
For engineered pandemics, it still seems to me that literally 100% of people dying from the pathogens themselves is much less likely than a very high number dying, perhaps even enough to cause existential catastrophe slightly "indirectly". However "well" engineered, pathogens themselves aren't agents which explicitly seek the complete extinction of humanity. (Again, Defence in Depth seems relevant here.) Though this is slightly different from a conversation about the relative likelihood of 10% vs other percentages. (Also, I feel hesitant to discuss this in great detail, for vague information hazards reasons.)
I agree regarding accidental physics risks. But I think the risks from those are far lower than the risks from AI and bio, and probably nanotech, nuclear, etc. (I don't really bring any independent evidence to the table; this is just based on the views I've seen from x-risk researchers.)
from the track record of known risks, it seems that probably there are many diverse unknown risks, and so probably at least a few of them do not have common mini-versions.
I think that'd logically follow from your prior statements. But I'm not strongly convinced about those statements, except regarding accidental physics risks, which seem very unlikely.
And by the argument you just gave, the "unknown" risks that have common mini-versions won't actually be unknown, since we'll see their mini-versions. So "unknown" risks are going to be disproportionately the kind of risk that doesn't have common mini-versions.
I think this is an interesting point. It does tentatively update me towards thinking that, conditional on there indeed being "unknown risks" that are already "in play", they're more likely than I'd otherwise think to jump straight to 100%, without "mini-versions".
However, I think the most concerning sources of "unknown risks" are new technologies or new actions (risks that aren't yet "in play"): the unknown equivalents of risks from nanotech, space exploration, unprecedented consolidation of governments across the globe, etc. "Drawing a new ball from the urn", in Bostrom's metaphor. So even if such risks do have "common mini-versions", we wouldn't yet have seen them.
Also, regarding the portion of unknown risks that are in play, it seems appropriate to respond to the argument "Most risks have common mini-versions, but we haven't seen these for unknown risks (pretty much by definition)" partly by updating towards thinking the unknown risks lack such common mini-versions, but also partly by updating towards thinking unknown risks are unlikely. We aren't forced to fully take the former interpretation.
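To make that split update concrete, here's a minimal Bayesian sketch; the hypotheses follow the discussion above, but all the numbers are purely illustrative assumptions rather than anyone's actual estimates. Let H1 be "unknown risks are in play and have common mini-versions", H2 be "unknown risks are in play but lack common mini-versions", H3 be "no substantial unknown risks are in play", and E be "we observe no mini-versions".
\begin{align*}
&P(H_1)=0.4,\quad P(H_2)=0.2,\quad P(H_3)=0.4,\\
&P(E\mid H_1)=0.1,\quad P(E\mid H_2)=P(E\mid H_3)=0.9,\\
&P(H_1\mid E)=\frac{0.4\cdot 0.1}{0.4\cdot 0.1+0.2\cdot 0.9+0.4\cdot 0.9}\approx 0.07,\\
&P(H_2\mid E)\approx 0.31,\qquad P(H_3\mid E)\approx 0.62.
\end{align*}
Under these made-up numbers, the absence of observed mini-versions shifts probability mass away from H1 and onto both H2 and H3, i.e. partly towards "unknown risks lack common mini-versions" and partly towards "unknown risks are unlikely", which is the split response described above.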
Tobias' original point was: "Also, if engineered pandemics, or 'unforeseen' and 'other' anthropogenic risks have a chance of 3% each of causing extinction, wouldn't you expect to see smaller versions of these risks (that kill, say, 10% of people, but don't result in extinction) much more frequently? But we don't observe that."
Thus he is saying there aren't any "unknown" risks that do have common mini-versions but just haven't had time to develop yet. That's way too strong a claim, I think. Perhaps in my argument against this claim I ended up making claims that were also too strong. But I think my central point is still right: Tobias' argument rules out things arising in the future that clearly shouldn't be ruled out, because if we had run that argument in the past it would have ruled out various things (e.g. AI, nukes, physics risks, and come to think of it even asteroid strikes and pandemics if we go far enough back in the past) that in fact happened.
1. I interpreted the original claim ("wouldn't you expect") as being basically one in which observation X was evidence against hypothesis Y. Not conclusive evidence, just an update. I didn't interpret it as "ruling things out" (in a strong way) or saying that there aren't any unknown risks without common mini-versions (just that it's less likely that there are than one would otherwise think). Note that his point seemed to be in defence of "Ord's estimates seem too high to me", rather than "the risks are 0".
2. I do think that Tobias's point, even interpreted that way, was probably too strong, or missing a key detail, in that the key sources of risks are probably emerging or new things, so we wouldn't expect to have observed their mini-versions yet. Though I do tentatively think I'd expect to see mini-versions before the "full thing", once the new things do start arising. (I'm aware this is all pretty hand-wavey phrasing.)
3i. As I went into more in my other comment, I think the general expectation that we'll see very small versions before and more often than small ones, which we'll see before and more often than medium ones, which we'll see before and more often than large ones, etc., probably would've served well in the past. There was progressively more advanced tech before AI, and AI is itself advancing progressively. There were progressively more advanced weapons, progressively more destructive wars, progressively larger numbers of nukes, etc. I'd guess the biggest pandemics and asteroid strikes weren't the first, because the biggest are rare.
3ii. AI is the least clear of those examples, because:
(a) it seems like destruction from AI so far has been very minimal (a handful of fatalities from driverless cars, the "flash crash", etc.), yet it seems plausible that major destruction could occur in future
(b) we do have specific arguments, though of somewhat unclear strength, that the same AI might actively avoid causing any destruction for a while, and then suddenly seize decisive strategic advantage etc.
But on (a), I do think most relevant researchers would say the risk this month from AI is extremely low; the risks will rise in future as systems become more capable. So there's still time in which we may see mini-versions.
And on (b), I'd consider that a case where a specific argument updates us away from a generally pretty handy prior that we'll see small things earlier and more often than extremely large things. And we also don't yet have super strong reason to believe that those arguments are really painting the right picture, as far as I'm aware.
3iii. I think if we interpreted Tobias's point as something like "We'll never see anything that's unlike the past", then yes, of course that's ridiculous. So as I mentioned elsewhere, I think it partly depends on how we carve up reality, how we define things, etc. E.g., do we put nukes in a totally new bucket, or consider them part of trends in weaponry/warfare/explosives?
But in any case, my interpretation of Tobias's point, where it's just about it being unlikely to see extreme things before smaller versions, would seem to work with e.g. nukes, even if we put them in their own special category: we'd be surprised by the first nuke, but we'd indeed see that there's one nuke before there are thousands, and there are two detonations on cities before there's a full-scale nuclear war (if there ever is one, which hopefully and plausibly there won't be).
In general I think you've thought this through more carefully than me, so without having read all your points I'm just gonna agree with you.
So yeah, I think the main problem with Tobias' original point was that unknown risks are probably mostly new things that haven't arisen yet, and thus the lack of observed mini-versions of them is no evidence against them. But I still think it's also true that some risks just don't have mini-versions, or rather are as likely or more likely to have big versions than mini-versions. I agree that most risks are not like this, including some of the examples I reached for initially.