It’s perhaps also worth separating the claims that A) previous alignment research was significantly less helpful than today’s research and B) the reason that was the case continues to hold today.
I think I’d agree with some version of A, but strongly disagree with B.
The reason that A seems probably true to me is that we didn’t know the basic paradigm in which AGI would arise, and so previous research was forced to wander in the dark. You might also believe that today’s focus on empirical research is better than yesterday’s focus on theoretical research (I don’t necessarily agree) or at least that theoretical research without empirical feedback is on thin ice (I agree).
I think most people now think that deep learning, perhaps with some modifications, will be what leads to AGI—some even think that LLM-like systems will be sufficient. And the shift from primarily theoretical research to primarily empirical research has already happened. So what will cause today’s research to be worse than future research with more capable models? You can appeal to a general principle of “unknown unknowns,” but if you genuinely believe that deep learning (or LLMs) will eventually be used in future AGI, it seems hard to believe that knowledge won’t transfer at all.