I feel like something has gone wrong in this conversation; you have tricked Bob into working on learning from human feedback, rather than convincing him to do so.
I agree with this. If people become convinced to work on AI by a specific argument X, then they should definitely go and try to fix X, not something else (e.g. what other people tell them needs doing in AI safety/governance).
I think when I said I wanted a more general argument to be the “default”, I meant something very general that doesn’t clearly imply any particular intervention, like the argument in the Most Important Century series, or the “AI is a big deal” argument (I especially like Max Daniel’s version of this).
Then it’s very important to think clearly about what will actually go wrong, and how to actually fix it. But I think it’s fine to do this once you’re already convinced, by some general argument, that you should work on AI.
I’d be really curious if you still disagree with this?
(Apologies for my very slow reply.)
I agree with that, and that’s what I meant by this statement above: