This is very true. However, the OP’s point still helps us: an AI that is simultaneously smart enough to be useful in a narrow domain, misaligned, and yet too stupid to take over the world could help us reduce x-risk. In particular, if it is superhumanly good at alignment research, it could output good alignment research as part of its deception phase. This would significantly reduce the risk from future AIs without itself causing x-risk since, ex hypothesi, the AI is too stupid to take over. The main question is whether an AI could be smart enough to do very good alignment research yet too stupid to take over the world if it tried. I am skeptical but pretty uncertain, so I would give it at least a 10% chance of being true, maybe higher.