My current takes on AI Welfare

This is a post for debate week. Feel especially free to disagree and/or ask for clarifications.

I’m starting off the debate week weakly disagreeing with the debate statement. I’m at a point where I have too many strong uncertainties to judge whether work on AI Welfare would be positive or negative. I’m not very confident in any of the points below, and hope to change my mind for good reasons this week! Where possible I am identifying cruxes so that you can tell me why I’m wrong in a way that will change my mind.

Below are some reasons for my current position, in no particular order:

I’m unsure whether we can in principle ascertain whether a digital mind is conscious.

I’m ready to accept the chance of consciousness in animals because of extensive analogies to conscious human behaviour (such as anhedonia, avoidance of negative stimuli, and altered activity under anaesthetics), plus a shared evolutionary history. Digital minds (or any AI systems we would have some reason to suspect are conscious) would be developed very differently from human minds, with very different incentives (such as acting in ways that humans prefer). This could lead to behaviour analogous to conscious behaviour in humans, but with a very different mechanism or purpose, one which does not actually produce qualia. I do not know how we would be able to tell the difference, even in theory.

A crux here is my view that philosophy of mind doesn’t really make much progress, and, additionally, that we are unlikely to find a convincing science of consciousness.

AI welfare success could mean existential failure

Putting money into AI welfare research and/or activism increases the chance of a future where we treat (at least some) AI systems as having moral value comparable to that of humans. If we are wrong about this, and they are not in fact conscious, this could be a disaster:

  • In the shorter term, because treating the AI systems nicely might cost resources which could otherwise be used to accelerate technological progress, helping conscious humans and animals.

  • In the longer term, because a world full of professedly happy digital minds which are in fact non-conscious is a world devoid of value.

The worlds where EA involvement in this issue is useful may be very few

The world where EA research and advocacy for AI welfare is most crucial is one where the reasons to think that AI systems are conscious are non-obvious, such that we require research to discover them and advocacy to convince the broader public of them. But I think a world where this is true, and where the advocacy succeeds, is a pretty unlikely one.

If we are in a world where advocacy for AI Welfare succeeds, then I think it is very likely we are in a world where the AI systems used by the majority of the population are incentivised to act as if they were conscious, and to form close relationships with their users. In this world, the important features of AI systems that advocates for their rights/welfare would point to would be surface-level and very visible. That is, we would not require research or openness to weird ideas in order to convince people to consider AI rights/welfare.

Alternatively, if we are in a world where the signs of true AI consciousness are not visible without research (i.e. they are not isomorphic to surface features of the AI, such as the text it outputs), then 1) research is not likely to change people’s minds if they already find AI consciousness very implausible, and 2) it is also not likely to change their minds if they find it very plausible and the research argues that the AI is not in fact conscious. So whether the public is convinced or unconvinced on AI consciousness and welfare, research won’t be a factor.

A crux that I have here is that research which takes a while to explain is not going to inspire a popular movement. This links to another crux: that AI welfare would have to be popular in order to be enforced.