[I’m not sure if you’ve thought about the following sort of question much. Also, I haven’t properly read your report—let me know if this is covered in there.]
I’m interested in a question along the lines of “Do you think some work done before TAI is developed matters in a predictable way—i.e., better than 0 value in expectation—for its effects on the post-TAI world, in ways that don’t just flow through how the work affects the pre-TAI world or how the TAI transition itself plays out? If so, to what extent? And what sort of work?”
An example to illustrate: “Let’s say TAI is developed in 2050, and the ‘TAI transition’ is basically ‘done’ by 2060. Could some work to improve institutional decision-making be useful in terms of how it affects what happens from 2060 onwards, and not just via reducing x-risk (or reducing suffering etc.) before 2060 and improving how the TAI transition goes?”
But I’m not sure it’s obvious what I mean by the above, so here’s my attempt to explain:
The question of when TAI will be developed[1] is clearly very important to a whole bunch of prioritisation questions. One reason is that TAI, and probably the systems leading up to it, will very substantially change many aspects of how society works. Specifically, Open Phil has defined TAI as “AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution” (and Muehlhauser has provided some more detail on what is meant by that).
But I think some EAs implicitly assume something stronger, along the lines of:
The expected moral value of actions we take now is entirely based on those actions’ effects on what happens before TAI is developed and those actions’ effects on the development, deployment, etc. of TAI. That is, the expected value of the actions we take now is not partly based on how the actions affect aspects of the post-TAI world in ways unrelated to how TAI is developed, deployed, etc. This is either because we just can’t at all predict those effects or because those effects wouldn’t be important; the world will just be very shaken up and perhaps unrecognisable, and any effects of pre-TAI actions will be washed out unless they affect how the TAI transition occurs.
E.g., things we do now to improve institutional decision-making or reduce risks of war can matter inasmuch as they reduce risks before TAI and reduce risks from TAI (and maybe also reduce actual harms, increase benefits, etc.). But they’ll have no even-slightly-predictable or substantial effect on decision-making or risks of war in the post-TAI world.
But I don’t think that necessarily follows from how TAI is defined. E.g., various countries, religions, ideologies, political systems, technologies, etc., existed both before the Industrial Revolution and for decades/centuries afterwards. And it seems like some pre-Industrial-Revolution actions (e.g., pushing for democracy or the abolition of slavery) had effects on the post-Industrial-Revolution world that could probably have been predicted in advance to be positive, and that weren’t just about affecting how the Industrial Revolution itself occurred.
(Though it may have still been extremely useful for people taking those actions to know that, when, where, and how the IR would occur, e.g. because then they could push for democracy and abolition in the countries that were about to become much more influential and powerful.)
So I’m tentatively inclined to think that some EAs are assuming that short timelines push against certain types of work more than they really do, and that certain (often “broad”) interventions could be in expectation useful for influencing the post-TAI world in a relatively “continuous” way. In other words, I’m inclined to think there might be less of an extremely abrupt “break” than some people seem to think, even if TAI occurs. (Though it’d still be quite extreme by many standards, just as the Industrial Revolution was.)
[1] Here I’m assuming TAI will be developed, which is questionable, though it seems to me pretty much guaranteed unless some existential catastrophe occurs beforehand.
I haven’t thought very deeply about this, but my first intuition is that the most compelling reason to expect to have an impact that predictably lasts longer than several hundred years without being washed out is the possibility of some sort of “lock-in”: technology that allows values and preferences to be more stably transmitted into the very long-term future than current technology allows. For example, the ability to program space probes with instructions for creating the type of “digital life” we would morally value, with error-correcting measures to prevent drift, would count in my mind as a technology that allows for effective lock-in.
A lot of people may act as if we can’t impact anything post-transformative AI because they believe technology that enables lock-in will be built very close in time after transformative AI (since TAI would likely cause R&D towards these types of tech to be greatly accelerated).
[Kind-of thinking aloud; bit of a tangent from your AMA]
Yeah, that basically matches my views.
I guess what I have in mind is that some people seem to:
round up “most compelling reason” to “only reason”
not consider the idea of trying to influence lock-in events that occur after a TAI transition, in ways other than influencing how the TAI transition itself occurs
Such ways could include things like influencing political systems in long-lasting ways
round up “substantial chance that technology that enables lock-in will be built very close in time after TAI” to “it’s basically guaranteed that...”
I think what concerns me about this is that I get the impression many people are doing this without noticing it. It seems like maybe some thought leaders recognised that there were questions to ask here, thought about the questions, and formed conclusions, but then other people just got a slightly simplified version of the conclusion without noticing there’s even a question to ask.
A counterpoint is that I think the ideas of “broad longtermism”, and some ideas that people like MacAskill have raised, kind-of highlight the questions I’m suggesting should be highlighted. But even those ideas seem to often be about what to do given the premise that a TAI transition won’t occur for a long time, or how to indirectly influence how a TAI transition occurs. So I think they’re still not exactly about the sort of thing I’m talking about.
To be clear, I do think we should put more longtermist resources towards influencing potential lock-in events prior to or right around the time of a TAI transition than towards non-TAI-focused ways of influencing events after a TAI transition. But it seems pretty plausible to me that some longtermist resources should go towards other things, and it also seems good for people to be aware that a debate could be had on this.
(I should probably think more about this, check whether similar points are already covered well in some existing writings, and if not write something more coherent than these comments.)