Agreed, I think this post provides a great insight that hasn’t been pointed out before, but it works best for the Carlsmith model, which is unusually conjunctive. Arguments for disjunctive AI risk have been made by Nate Soares here and by Kokotajlo and Dai here.
Both of the links you suggest are strong philosophical arguments for ‘disjunctive’ risk, but neither is actually a model schema (although Soares does imply he has such a schema and just hasn’t published it yet). The fact that I only use Carlsmith to model risk is a fair reflection of the state of the literature.
(As an aside, this seems really weird to me: there is almost no community pressure to have people explicitly draw out their model schema in PowerPoint or on a piece of paper or something. This seems like a fundamental first step in communicating about AI risk, but only Carlsmith has really done it to an actionable level. Am I missing something here? Are community norms in AI risk very different to community norms in health economics, which is where I usually do my modelling?)
Agreed on that as well. The Carlsmith report is the only quantitative model of AI risk I’m aware of, and it was the right call to do this analysis on it. I think we do have reasonably large error bars on its parameters (though perhaps smaller than an order of magnitude), which means your insight is important.
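To make the error-bar point concrete, here’s a toy Monte Carlo sketch of a six-stage conjunctive model. The Beta distributions are placeholders I made up, not Carlsmith’s actual estimates; the point is just that modest uncertainty on each stage compounds into a wide interval on the headline number.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder Beta distributions for six conjunctive stage probabilities.
# Illustrative only -- these are NOT Carlsmith's actual estimates.
stage_params = [(6, 4), (8, 2), (4, 6), (6, 4), (4, 6), (9, 1)]

# Sample each stage probability, then multiply through the conjunction.
samples = np.prod(
    [rng.beta(a, b, size=100_000) for a, b in stage_params], axis=0
)

print(f"median headline risk: {np.median(samples):.3f}")
print(f"90% interval: {np.percentile(samples, 5):.3f} to {np.percentile(samples, 95):.3f}")
```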
Why aren’t there more models? My guess is that it’s just very difficult, with lots of overlapping and entangled scenarios that are hard to tease apart. How would you go about constructing an overall x-risk estimate from a list of disjunctive risks? You can’t assume they’re independent events, and generating conditional probabilities for each seems challenging and not necessarily helpful.
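To illustrate why the independence assumption matters, here’s a rough sketch with made-up scenario probabilities, using a Gaussian copula as a stand-in for whatever the real dependence structure is:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative marginal probabilities for three hypothetical disjunctive
# scenarios (made-up numbers, purely to show the mechanics).
p = np.array([0.05, 0.10, 0.15])

# If the scenarios were independent: P(any) = 1 - prod(1 - p_i)
p_any_indep = 1 - np.prod(1 - p)

# Gaussian copula with pairwise correlation rho: same marginals, but the
# scenarios tend to occur together in the same worlds.
rho, n = 0.6, 200_000
cov = rho * np.ones((3, 3)) + (1 - rho) * np.eye(3)
z = rng.multivariate_normal(np.zeros(3), cov, size=n)
occurs = norm.cdf(z) < p          # each column still has marginal p_i
p_any_corr = occurs.any(axis=1).mean()

print(f"P(any) independent: {p_any_indep:.3f}, correlated: {p_any_corr:.3f}")
```

Positive correlation pulls P(any) below the independence figure, because the scenarios tend to co-occur in the same worlds; but choosing the correlation is exactly the kind of judgement call that seems hard to defend.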
Ajeya Cotra’s BioAnchors report is another quantitative model that drives a lot of beliefs about AI timelines. Stephanie Lin won the EA Critique Contest with a critique of it, but I’d be curious whether you have other concerns with it.