Thanks very much for writing this, and thanks to Greg for funding it! I think this is a really important discussion. Some slightly rambling thoughts below.
We can think about 3 ways of improving the EV of the far future:
1: Changing incentive structures experienced by powerful agents in the future (e.g. avoiding arms races, power struggles, selection pressures)
2a: Changing the moral compass of powerful agents in the future in specific directions (e.g. MCE)
2b: Indirect ways to improve the moral compass of powerful agents in the future (e.g. philosophy research, education, intelligence/empathy enhancement)
All of these are influenced both by strategies such as activism, improving institutions, and improving education, as well as by AIA. I am inclined to think of AIA as a particularly high-leverage point at which we can have influence on these.
However, these issues are widely encountered. Consider 2b: we have to decide how to educate the next generation of humans, and they may well end up with ethical beliefs that are different from ours, so we must judge how much to try to influence or constrain them, and how much to accept that the changes are actually progress. This is similar to the problem of defining CEV: we have some vague idea of the direction in which better values lie (more empathy, more wisdom, more knowledge), but we can’t say exactly what the values should be. For this intervention, working on AIA may be more important than activism because it has more leverage—it is likely to be more tractable and to have greater influence on the future than the more diffuse ways in which we can push on education and intergenerational moral progress.
This framework also suggests that MCE is just one example of a collection of similar interventions. MCE involves pushing for a fairly specific belief and behaviour change based on a principle that’s fairly uncontroversial. You could imagine similar interventions—for instance, helping people reduce unwanted aggressive or sadistic behaviour. We could call this something like ‘uncontroversial moral progress’: helping individuals and civilisation live by their values more fully. (On a side note: sometimes I think of this as the minimal core of EA: trying to live according to your best guess of what’s right.)
The choice between working on 2a and 2b depends, among other things, on your level of moral uncertainty.
I am inclined to think that AIA is the best way to work on 1 and 2b, as it is a particularly high-leverage intervention point to shape the power structures and moral beliefs that exist in the future. It gives us more of a clean slate to design a good system, rather than having to work within a faulty system.
I would really like to see more work on MCE and other examples of ‘uncontroversial moral progress’. Historical case studies of value changes seem like a good starting point, as well as actually testing the tractability of changing people’s behaviour.
I also really appreciated your perspective on different transformative AI scenarios, as I’m worried I’m thinking about it in an overly narrow way.