Thank you for reading and for your detailed comment. In general I would agree that my post is not a neutral survey of the VWH but a critical response, and I think I made that clear in the introduction even if I did not call it red-teaming explicitly.
I’d like to respond to some of the points you make.
“As Zach mentioned, I think you at least somewhat overstate the extent to which Bostrom is recommending as opposed to analyzing these interventions.”
I think this is overall unclear in Bostrom’s paper, but he does have a section called Policy Implications right at the top of the paper where he says “In order for civilization to have a general capacity to deal with “black ball” inventions of this type, it would need a system of ubiquitous real-time worldwide surveillance. In some scenarios, such a system would need to be in place before the technology is invented.” I think it is confusing because he starts out analyzing the urn of technology, then conditioned on there being black balls in the urn he recommends ubiquitous real-time worldwide surveillance, and then the ‘high-tech panopticon’ example is just one possible incarnation of that surveillance that he is analyzing. I think it is hard to deny that he is recommending the panopticon if existential risk prevention is the only value we’re measuring. He doesn’t claim all-things-considered support, but my response isn’t about other considerations of a panopticon. I don’t think a panopticon is any good even if existential risk is all we care about.
“You seem to argue (or at least give the vibe that) that there’s so little value in trying to steer technological development for the better than we should mostly not bother and instead just charge ahead as fast as possible.”
I think this is true as far as it goes, but you miss what is in my opinion the more important second part of the argument. Predicting the benefits of future tech is very difficult, but even if we knew all of that, getting the government to actually steer in the right direction is harder. For example, economists have known for centuries that domestic farming subsidies are inefficient: they are wasteful and they produce big negative externalities. Yet almost every country on earth has big domestic farming subsidies, because they benefit a small, politically active group. I admit that we have some foreknowledge of which technologies look dangerous and which do not. That is far from sufficient for using the government to decrease risk.
The point of Enlightenment Values is not that no one should think about the risks of technology and we should all charge blindly forward. Rather, it is that decisions about how best to steer technology for the better can and should be made on the individual level, where they are more voluntary, are constrained by competition, and have their mistakes hedged by lots of other people making different decisions.
“A core premise/argument in your post appears to be that pulling a black ball and an antidote (i.e., discovering a very dangerous technology and a technology that can protect us from it) at the same time means we’re safe. This seems false, and I think that substantially undermines the case for trying to rush forward and grab balls from the urn as fast as possible.”
Some technologies pair up like engineered viruses and vaccines, but whether one counters the other depends much more on their relative costs than on their mere existence. An antidote to $5-per-infection viruses might need to be $1-per-dose vaccines or $0.50-per-mask PPE. If you just define an antidote to be “a technology which is powerful and cheap enough to counter the black ball should they be pulled simultaneously” then the premise stands.
“Do you (the reader) feel confident that everything will go well in that world where all possible techs and insights on dumped on us at once?”
Until meta-understanding of technology greatly improves, this is ultimately a matter of opinion. If you think there exists some technology that is incompatible with civilization in all contexts, then I can’t really prove you wrong, but it doesn’t seem right to me.
Type-0 vulnerabilities were ‘surprising strangelets’: not techs that are incompatible with civilization in all contexts, but risks that come from unexpected phenomena, like the Large Hadron Collider opening a black hole.
“I think the following bolded claim is false, and I think it’s very weird to make this empirical claim without providing any actual evidence for it: “AI safety researchers argue over the feasibility of ‘boxing’ AIs in virtual environments, or restricting them to act as oracles only, but they all agree that training an AI with access to 80+% of all human sense-data and connecting it with the infrastructure to call out armed soldiers to kill or imprison anyone perceived as dangerous would be a disaster.””
You’re right that I didn’t get any survey of AI researchers for this question. The near-tautological nature of “properly aligned superintelligence” guarantees that if we had it, everything would go well. So yeah, probably lots of AI researchers would agree that a properly aligned superintelligence would use surveillance to improve the world. This is a pretty empty statement imo. The question is about what we should do next. This hypothetical aligned intelligence tells us nothing about what increasing state AI surveillance capacity does on the margin. Note that Bostrom is not recommending that an aligned superintelligent being do the surveillance. His recommendations are about increasing global governance and surveillance on the margin. The AI he mentions is just a machine learning classifier that can help a human government blur out the private parts the cameras collect.
“I’m just saying that thinking that increased surveillance, enforcement, moves towards global governance, etc. would be good doesn’t require thinking that permanent extreme levels (centralised in a single state-like entity) would be good.”
This is only true if you have a reliable way of taking back increased surveillance, enforcement, and moves towards global governance. The alignment and instrumental convergence problems I outlined in those sections give strong reasons why these capabilities are extremely difficult to take back. Bostrom scarcely mentions the issue of getting governments to enact his risk-reducing policies once they have the power to enforce them, let alone gives a mechanism design for a government that would judiciously use its power to guide us through the time of perils and then reliably step down. Without such a plan, the issues of power-seeking and misalignment are not ones you can ignore.