Caveats:
I endorse the argument that we should figure out how to use LLM-based systems without accidentally torturing them, because they're more likely to take catastrophic actions if we're torturing them.
I haven't tried to understand the argument that we should pay AIs to [not betray us / tell on traitors / etc.], and that working on AI-welfare stuff would help us offer AIs payment more effectively; there might be something there.
I don’t understand the decision theory mumble mumble argument; there might be something there.
(Other than that, it seems hard to tell a story about how “AI welfare” research/interventions now could substantially improve the value of the long-term future.)
(My impression is that these arguments are important to very few AI-welfare prioritizers / that most AI-welfare prioritizers have the wrong reasons.)
FWIW, these motivations seem reasonably central to me personally, though not my only motivations.
Among your friends, I agree; among EA Forum users, I disagree.
Yes, I meant central to me personally, edited the comment to clarify.