The superintelligence is misaligned with our own objectives but is benign
I don’t see how this is possible. There is no such thing as “a little misalignment”. Keep in mind that creating an unstoppable and uncontrollable AI is a one-shot event that can’t be undone and will have extremely wide and long-term effects on everything. If this AI is misaligned even very slightly, the differences between its goals and humanity’s will accumulate and grow over time. It’s similar to launching a rocket without any steering mechanism with the aim of landing it on Jupiter’s moon Europa: You have to set every parameter exactly right or the rocket will miss the target by a wide margin. Even the slightest deviation, e.g. an unaccounted-for asteroid passing close to the rocket and altering its course very slightly through gravitational effects, would completely ruin the mission.
On the other hand, if we manage to build an AGI that is “docile” and “corrigible” (which I doubt very much we can do), this would be similar to having a rocket that can be steered from afar: In this case, I would say it is fully aligned, even if corrections are necessary once in a while.
Should we end up with both a misaligned and an aligned AGI (or several of each), it is very likely that the worst AGI (from humanity’s perspective) will win the battle for world supremacy, so this is more or less the same as having just one misaligned AGI.
My personal view on your subject is that you don’t have to work in AI to shape its future. You can also do that by bringing the discussion to the public and creating awareness of the dangers. This is especially relevant, and may even be more effective than a career in an AI lab, if our only chance of survival is to prevent a misaligned AI, at least until we have solved alignment (see my post on “red lines”).
“The superintelligence is misaligned with our own objectives but is benign”. You could have an AI with some meta-cognition, able to figure out what’s good and maximize it in the same way EAs try to figure out what’s good and maximize it with parts of their lives. This view mostly makes sense if you give some credence to moral realism.
“My personal view on your subject is that you don’t have to work in AI to shape its future.” Yes, that’s what I wrote in the post.
“You can also do that by bringing the discussion into the public and create awareness for the dangers.” I don’t think it’s a good method, and I think you should target a much more specific audience, but yes, I know what you mean.
You could have an AI with some meta-cognition, able to figure out what’s good and maximize it in the same way EAs try to figure out what’s good and maximize it with parts of their lives.
I’m not sure how that would work, but we don’t need to discuss it further; I’m no expert.
I don’t think it’s a good method, and I think you should target a much more specific audience, but yes, I know what you mean.
What exactly do you think is “not good” about a public discussion of AI risks?