This is one of the reasons I care about AI in the first place, and it’s a relief to see someone talking about it. I’d love to see research on the question: “Conditional on the AI alignment problem being ‘solved’ to some extent, what happens to animals in the hundred years after that?”
Some butterfly considerations:
- How much does it matter for the future of animal welfare whether current AI researchers care about animals?
- Should responsible animal advocates consider trying hard to become AI researchers?
- If by magic we ‘solve’ AI by making it corrigible-to-a-certain-group-of-people, and that corrigible AI is still (by magic) able to do pivotal things like prevent other powerfwl AIs from coming into existence, then the values of that group could matter a lot.
- How likely is it that some values get ‘locked in’ for some versions of ‘solved AI’? It doesn’t matter whether you think locked-in values don’t count as ‘solved’; I’m not here to debate definitions, just to figure out how important it is to get some concern for animals in there if the set of values ends up more or less inelastic to later changes in human values, e.g. due to organizational culture, or intentional learning rate decay in its value learning function, or some other mechanism I have no clue about.
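To gesture at that last mechanism: below is a minimal sketch (my own toy construction, not anyone’s actual proposal) of a value-learning loop whose learning rate decays over time, so whatever concern is learned early gets frozen and later shifts in human values barely register. The decay schedule, the noise, and the late jump in human concern are all made up for illustration.

```python
import numpy as np

# Toy sketch: value lock-in from learning-rate decay. The AI tracks a
# single scalar (its learned weight on animal welfare) by stepping toward
# noisy human feedback. Everything here is hypothetical: the decay
# schedule, the noise level, and the late shift in human values.

rng = np.random.default_rng(0)

estimate = 0.0              # AI's learned weight on animal welfare
lr0, half_life = 0.1, 100   # initial learning rate, decay half-life (in steps)

trajectory = []
for t in range(2000):
    # Suppose human concern for animals jumps late, at t = 1000.
    human_value = 0.1 if t < 1000 else 0.9
    feedback = human_value + rng.normal(0, 0.05)

    lr = lr0 * 0.5 ** (t / half_life)   # exponential learning-rate decay
    estimate += lr * (feedback - estimate)
    trajectory.append(estimate)

print(f"learned weight at t=999:  {trajectory[999]:.2f}")   # ~0.1: tracked early values
print(f"learned weight at t=1999: {trajectory[1999]:.2f}")  # still ~0.1: locked in
```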
For exactly the same reasons it could be hard for the AI to understand human preferences due to ‘The Pointers Problem’, it is (admittedly to a lesser extent) hard for humans to understand animal preferences due to the ‘Umwelt Problem’: what animals care about is a function of how they perceive their own environment, and we might expect less convergence between our latent categories and those in the umwelts of lesser intelligences. So if an AI being aligned means that it cares about animals to the extent humans do, it could still be unaligned with respect to the animals’ own values to the extent humans are mistaken about them (which we most certainly are).
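Here’s a toy illustration of that mismatch (random stand-in encoders, not a model of any real animal): the animal’s utility is defined over its own latent features of the environment, we attribute the same preferences but over our latent features, and optimizing our proxy then says little about the animal’s actual utility.

```python
import numpy as np

# Toy sketch of the 'Umwelt Problem'. The animal's preferences are a
# function of *its* latent carving of the environment; the human model
# attributes those same preferences to a *different* carving. All
# encoders and preferences are random stand-ins, purely illustrative.

rng = np.random.default_rng(1)
dim = 10

animal_encoder = rng.normal(size=(3, dim))  # the animal's latent categories
human_encoder = rng.normal(size=(3, dim))   # ours: a different carving of the same world
animal_prefs = rng.normal(size=3)           # utility over the animal's own latents

states = rng.normal(size=(10_000, dim))                  # candidate world-states
true_scores = states @ animal_encoder.T @ animal_prefs   # what the animal actually cares about
proxy_scores = states @ human_encoder.T @ animal_prefs   # our mis-pointed estimate of it

# The state our proxy rates best is typically mediocre by the animal's lights.
print(f"animal utility of proxy-optimal state:  {true_scores[np.argmax(proxy_scores)]:.2f}")
print(f"animal utility of animal-optimal state: {true_scores[np.argmax(true_scores)]:.2f}")
```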
> So if an AI being aligned means that it cares about animals to the extent humans do, it could still be unaligned with respect to the animals’ own values to the extent humans are mistaken about them (which we most certainly are).
I very much agree with this. This will actually be one of the topics I will research in the next 12 months, with Peter Singer.
Love this. It’s one of the things on my “possible questions to think about at some point” list. My motivation would be:
1. Try to figure out what specific animals care about. (A simple sanity check here is to try to figure out what a human cares about, which is hard enough. Try expanding this question to humans from different cultures, and it quickly gets more and more complicated.)
2. Try to figure out how I’m figuring out what animals care about. This is the primary question, because we want to generalize the strategies for helping beings that care about different things than us. This is usefwl not just for animals, but also as a high-level approach to the pointers problem in the human case.
Most of the value of the project comes from 2, so I would pay very carefwl attention to what I’m doing when trying to answer 1. Whenever I get an insight on 1, what general features led me to that insight? (For a sense of what a first stab at 1 could look like, see the sketch below.)
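Purely for concreteness, a toy first stab at 1 (the options, the features, and the Boltzmann-rationality assumption are all mine, and real animals violate every one of them, which is partly why 2 is the interesting question): watch an animal’s choices and fit a preference vector by maximum likelihood.

```python
import numpy as np

# Toy stab at question 1: infer what an animal "cares about" from its
# observed choices, under the strong, made-up assumption that it chooses
# Boltzmann-rationally over options with known feature vectors.

rng = np.random.default_rng(2)

# Options described by features, e.g. [food, safety, social contact].
options = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
])
true_prefs = np.array([0.2, 1.5, 0.4])  # the animal's actual weights (unknown to us)
beta = 3.0                              # assumed degree of rationality

def choice_probs(prefs):
    logits = beta * options @ prefs
    p = np.exp(logits - logits.max())
    return p / p.sum()

# Simulate 500 observed choices, then fit weights by gradient ascent
# on the softmax log-likelihood.
choices = rng.choice(len(options), size=500, p=choice_probs(true_prefs))
emp = np.bincount(choices, minlength=len(options)) / len(choices)

est = np.zeros(3)
for _ in range(2000):
    est += 0.1 * beta * options.T @ (emp - choice_probs(est))

# Softmax choices pin down preferences only up to an additive constant
# (here a uniform shift of the weights, since each option's features sum
# to 1), so compare mean-centered weights.
print("true prefs (centered):     ", np.round(true_prefs - true_prefs.mean(), 2))
print("inferred prefs (centered): ", np.round(est - est.mean(), 2))
```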