Thanks for this Rory, I’m excited to see what else you have to say on this topic.
One thing I think this post is missing is a more detailed response to the ‘ideal governance as weird’ criticism. You write that ‘weird ideal governance theories may well be ineffective’, but I would suggest that almost all fleshed-out theories of ideal AI governance will be inescapably weird, because most plausible post-transformative AI worlds are deeply unfamiliar by nature.
A good intuition pump for this is to consider how weird modern Western society would seem to people from 1,000 years ago. We currently live in secular market-based democratic states run by a multiracial, multigender coalition of individuals whose primary form of communication is the instantaneous exchange of text via glowing, beeping machines. If you went back in time and tried to explain this world to an inhabitant of a mediaeval European theocratic monarchy, even to a member of the educated elite, they would be utterly baffled. How could society maintain order if the head of state was not blue-blooded and divinely ordained? How could peasants (particularly female ones) even learn to read and write, let alone effectively perform intellectual jobs? How could a society so dependent on usury avoid punishment by God in the form of floods, plagues or famines?
Even on the most conservative assumptions about AI capabilities, we can expect advanced AI to transform society at least as much as it has changed in the last 1,000 years. At a minimum, it promises to eliminate most productive employment, significantly extend our lifetimes, allow us to intricately surveil each and every member of society, and drastically increase the material resources available to each person. A world with these four changes alone seems radically different from our own and deeply unfamiliar, meaning any theory about its governance is going to seem weird. Throw in ideas like digital people and space colonisation and you’re jumping right off the weirdness deep end.
Of course, weirdness isn’t per se a reason not to go ahead with investigation into this topic, but I think the Wildeford post you cited is on the right track when it comes to weirdness points. AI Safety and Governance already struggles for respectability, so if you’re advocating for more EA resources to be dedicated to the area I think you need to give a more thorough justification for why it won’t discredit the field.
Thanks for this comment John. Briefly, one related thought I have is that weirdness is an important concern but that not all AI ideal governance theories are necessarily so weird that they’re self-defeating. I’m less concerned that theories that consider what institutions/norms should look like in the nearer future (such as around the future of work) are too weird, for example.
Broadly, I think your comment reinforces an important concern, and that further research on this topic would benefit from being mindful of the purpose it is trying to serve and its intended audience.
I have a couple thoughts on this.
First, if you’re talking about nearer-term questions, like ‘What’s the right governance structure for a contemporary AI developer to ensure its board acts in the common interest?’ or ‘How can we help workers reskill after being displaced from the transport industry?’, then I agree that doesn’t seem too strange. However, I don’t see how this would differ from the work that folks at places like LPP and Gov.AI are doing already.
Second, if you’re talking about longer-term ideal governance questions, I reckon even relatively mundane topics are likely to seem pretty weird when studied in a longtermist context, because the bottom line for researchers will be how contemporary governance affects future generations.
To use your example of the future of work, an important question in that topic might be whether and when we should attribute legal personhood to digital labourers, with the bottom line concerning the effect of any such policy on the moral expansiveness of future societies. The very act of supposing that digital workers as smart as humans will one day exist is relatively weird, let alone considering their legal status, let further alone discussing the potential ethics of a digital civilisation.
This is of course a single, cherry-picked example, but I think that most papers justifying specific positive visions of the future will need to consider the impact of these intermediate positive worlds on the long-term future, which will appear weird and uncomfortably utopian. Meanwhile, I suspect that work with a negative focus (‘How can we prevent an arms race with China?’) or a more limited scope (‘How can we use data protection regulations to prevent bad actors from accessing sensitive datasets?’) doesn’t require this sort of abstract speculation, suggesting that research into ideal AI governance carries reputational hazards that other forms of safety/governance work do not. I’m particularly concerned that this will open up AI governance to more hit-pieces of this variety, turning off potential collaborators whose first interaction with longtermism is a bad-faith critique.
Thanks again for your comment. Two quick related points:
People at the places you mention are definitely already doing interesting work relevant to ideal theory, e.g., regarding institutional design. Something distinctive about AI ideal governance research that I do think is less common is consideration of the normative components of AI governance issues.
On reflection, your comments and examples have convinced me that in the original post I didn’t take the ‘weirdness problem’ seriously enough. Although I’d guess we might still have a slight disagreement about the scope (and possibly the implications) of the problem, I certainly see that it is particularly salient for longtermists at the moment given the debates around the publication of Will MacAskill’s new book. As an example of ideal governance research that considers longer-term issues (including some ‘weird’ ones) in an analytical manner, Nick Bostrom, Allan Dafoe and Carrick Flynn’s paper on ‘Public Policy and Superintelligent AI’ may be of interest.
Taking each of your points in turn:
Okay. Thanks for clarifying that for me. I think we agree more than I expected, because I’m pretty in favour of their institutional design work.
I think you’re right that we have a disagreement w/r/t scope and implications, but it’s not clear to me to what extent this is also just a difference in ‘vibe’ which might dissolve if we discussed specific implications. In any case, I’ll take a look at that paper.