I’m unsure whether this is just a nitpick or too much of a personal take, but which precise version of the orthogonality thesis goes through has little effect on how worried I am about AGI, and I worry that the nonexpert reader this article is aimed at will come away thinking that it does.
The argument for worry about AGI is, in my mind, carried by:
1. cooperation failures in multipolar AGI takeoffs, and
2. instrumental convergence around self-preservation (and perhaps resource acquisition, though I’m less sure about that),
combined with the consideration that it might all go well a thousand times, but when the 1,042nd AGI is started, it might not. And I imagine that once AGIs are useful, lots of them will be started by many different actors.
Conversely, I had filed the orthogonality thesis away as a counterargument to a claim I’ve heard a few times, something like: “Smarter people tend to be nicer, so we shouldn’t worry about superintelligence, because it’ll just be super nice.” A weak orthogonality thesis, one that merely shows this is not necessarily the case, is enough to defend the case for worry.
I think I subscribe to a stronger formulation of the orthogonality thesis, but I’d have to think long and hard to come up with ways in which that would matter. (I’m sure it does in some subtle ways.)
Thanks for the thorough argumentation!