On my reading, “human-level automated alignment researcher” means a system that is human-level at alignment research, but not AGI. You can take the position that in order to be human-level at alignment research, it will need to be AGI, but I don’t think that’s necessarily true, and in any case it’s certainly not obvious. For myself, I keep being surprised at how capable systems can get at particular abilities without being fully general. (Years ago I wrongly believed that AGI would be necessary for artificial systems to reach the level of language capability they have right now; back in the ’70s, Hofstadter wrongly believed AGI would be necessary for superhuman chess ability; etc.)
It’s hard to imagine a more general and capability-demanding activity than doing good (superhuman!) science in such an absurdly cross-disciplinary field as AI safety (and among the disciplines involved are some that are notoriously not very scientific yet: psychology, sociology, economics, consciousness studies, ethics, etc.). So if there is an AI that can do that but still doesn’t count as AGI, I don’t know what the heck ‘AGI’ should even refer to. Compare this with chess, which is a very narrow problem that can be formally defined and doesn’t require the AI to operate with any science (or world models) whatsoever.