Like if you're contemplating running a fellowship program for AI-interested people, and you have animals in your moral circle, you're going to have to build this botec that includes the probability that X% of the people you bring into the fellowship are not going to care about animals and are likely, if they get a policy role, to pass policies that are really bad for them...
...I sort of suspect that only a handful of people are trying to do this, and I get why! I made a reasonably straightforward botec for calculating the benefits to birds of bird-safe glass, one that accounted for backfire to birds, and it took a lot of research effort. If you asked me how bird-safe glass policy is going to affect AI risk after all that, I might throw my computer at you. But I think the precise probabilities approach would imply that I should.
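(To give a sense of the shape: a heavily simplified sketch of that kind of botec, with a single backfire term, might look like the following. Every number and variable name here is an invented placeholder for illustration, not an estimate from the actual analysis.)

```python
# Hypothetical sketch of a bird-safe-glass botec with a backfire term.
# All numbers are illustrative placeholders, not real estimates.

buildings_retrofitted = 1_000          # buildings affected by the policy
collisions_per_building_per_year = 10  # baseline bird collisions per building
p_collision_fatal = 0.5                # share of collisions that kill the bird
reduction_from_safe_glass = 0.7        # fraction of collisions prevented

# Expected birds saved per year, before considering backfire
birds_saved = (
    buildings_retrofitted
    * collisions_per_building_per_year
    * p_collision_fatal
    * reduction_from_safe_glass
)

# Backfire term: e.g. some probability that the policy displaces spending
# from more effective bird-protection measures, costing some birds.
p_backfire = 0.1
birds_lost_if_backfire = 2_000

expected_net_birds_saved = birds_saved - p_backfire * birds_lost_if_backfire
print(expected_net_birds_saved)  # roughly 3,300 birds/year in this toy example
```

Even this toy version shows why each extra second-order effect (another backfire pathway, another affected group) multiplies the research effort, since each new term needs its own defensible inputs.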
Just purely on the descriptive level and not the normative one --
I agree but even more strongly: in AI safety I've basically never seen a BOTEC this detailed. I think Eric Neyman's BOTEC of the cost-effectiveness of donating to congressional candidate Alex Bores is a good public example of the type of analysis common in EA-driven AI safety work: it bottoms out in pretty general goods like "government action on AI safety" and does not try to model second-order effects to the degree described here. It doesn't even model considerations like "what if AI safety legislation is passed, but that legislation backfires by increasing polarization on the issue?", let alone anything about animals.
Instead, this kind of strategic discussion tends to be qualitative, and is hashed out in huge blocks of prose and comment threads (e.g. on LessWrong), or verbally.
I sort of wonder if some people in the AI community -- and maybe you, from what you've said here? -- are using precise probabilities to get to the conclusion that they want to work primarily on AI stuff, and then spotlighting to that cause area when analyzing at the level of interventions.
I see why you describe it this way, and directionally this seems right. But what we do doesn't really sound like "spotlighting" as you describe it in the post: focusing on specific moral patient groups and explicitly setting aside others.
Essentially I think the epistemic framework we use is just more anarchic and freeform than that! In AIS discourse, it feels like "but this intervention could slow down the US relative to China" or "but this intervention could backfire by increasing polarization" or "but this intervention could be bad for animals" all exist at the same epistemic level, and all are considered valid points to raise.
(I do think that there is a significant body of orthodox AI safety thought which takes particular stances on each of these and other issues, which in a lot of contexts likely makes various points feel like they're not "valid" to raise. I think this is unfortunate.)
Maybe it's similar to the difference between philosophy and experimental science, where in philosophy a lot of discourse is fundamentally unstructured and qualitative, and in the experimental sciences there is much more structure because any contribution needs to be an empirical experiment, and there are specific norms and formats for those, which have certain implications for how second-order effects are or aren't considered. AI safety discourse also feels similar at times to wonk-ish policy discourse.
(Within certain well-scoped sub-areas of AI safety things are less epistemically anarchic; e.g. research into AI interpretability usually needs empirical results if it's to be taken seriously.)
I think someone using precise probabilities all the way down is building a lot more explicit models every time they consider a specific intervention. Like if you're contemplating running a fellowship program for AI-interested people, and you have animals in your moral circle, you're going to have to build this botec that includes the probability that X% of the people you bring into the fellowship are not going to care about animals and are likely, if they get a policy role, to pass policies that are really bad for them. And all sorts of things like that. So your output would be a bunch of hypotheses about exactly how these fellows are going to benefit AI policy, and some precise probabilities about how those policy benefits are going to help people, and possibly animals, and to what degree, etc.
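(For concreteness, the kind of fully explicit model I read this as describing would look something like the sketch below. Every variable name and number is an invented placeholder, not anyone's actual estimate, and putting benefits to people and harms to animals in the same units quietly assumes a moral-weight conversion that would itself need an explicit model.)

```python
# Hypothetical sketch of a "precise probabilities all the way down" model
# for the fellowship example. All quantities are made-up placeholders.

n_fellows = 20
p_fellow_gets_policy_role = 0.15      # chance a given fellow lands a policy role
p_role_improves_ai_policy = 0.3       # chance that role produces an AI-safety win
value_of_ai_policy_win = 1_000        # benefit to people, arbitrary units

share_fellows_indifferent_to_animals = 0.4   # the "X%" in the quote above
p_passes_animal_harming_policy = 0.1         # given they hold a policy role
harm_to_animals_per_bad_policy = 500         # same arbitrary units

# Expected benefit via AI policy wins
expected_benefit = (
    n_fellows
    * p_fellow_gets_policy_role
    * p_role_improves_ai_policy
    * value_of_ai_policy_win
)

# Expected second-order harm to animals from fellows who don't care about them
expected_animal_harm = (
    n_fellows
    * share_fellows_indifferent_to_animals
    * p_fellow_gets_policy_role
    * p_passes_animal_harming_policy
    * harm_to_animals_per_bad_policy
)

print(expected_benefit - expected_animal_harm)  # ~840 in these made-up units
```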
Hmm, I wouldn't agree that someone using precise probabilities "all the way down" is necessarily building these kinds of explicit models. I wonder if the term "precise probabilities" is being understood differently in our two areas.
In the Bayesian epistemic style that EA x AI safety has, it's felt that anyone can attach precise probabilities to their beliefs with ~no additional thought, and that these probabilities are subjective things which may not be backed by any kind of explicit or even externally legible model. There's a huge focus on probabilities as betting odds, and betting odds don't require such things (diverging notably from how probabilities are used in science).
I mean, I think typically people have something to say to justify their beliefs, but this can be & often is something as high-level as "it seems good if AGI companies are required to be more transparent about their safety practices," with little in the way of explicit models about downstream effects thereof.[1]
Apologies for not responding to some of the other threads in your post -- I ran out of time. Looking forward to discussing in person sometime.
[1] While it's common for AI safety people to agree with my statement about transparency here, some may flatly disagree (i.e. disagree about sign), and others (more commonly) may disagree massively about the magnitude of the effect. There are many verbal arguments but relatively few explicit models to adjudicate these disputes.
All very interesting, and yes let's talk more later!
One quick thing: sorry my comment was unclear -- when I said "precise probabilities" I meant the overall approach, which amounts to trying to quantify everything about an intervention when deciding its cost-effectiveness (perhaps the post was also unclear).
I think most people in EA/AW spaces use the general term "precise probabilities" the same way you're describing, but perhaps there is on average a tendency toward the more scientific style of needing more specific evidence for those numbers. That wasn't necessarily true of early actors in the WAW space and I think it had some mildly unfortunate consequences.
But this makes me realize I should not have named the approach that way in the original post, and should have called it something like the "quantify as much as possible" approach. I think that approach requires using precise probabilities -- since if you allow imprecise ones you end up with a lot of things being indeterminate -- but there's more to it than just endorsing precise probabilities over imprecise ones (at least as I've seen it appear in WAW).