Ben Millwood comments on Objections to Value-Alignment between Effective Altruists

Ben Millwood 19 Jul 2020 14:34 UTC
32 points
Here are a couple of interpretations of value alignment:
- A pretty tame interpretation of “value-aligned” is “also wants to do good using reason and evidence”. In this sense, distinguishing between value-aligned and non-aligned hires is basically distinguishing between people who are motivated by the cause and people who are motivated by the salary or the prestige or similar. It seems relatively uncontroversial that you’d want to care about this kind of alignment, and I don’t think it reduces our capacity for dissent: indeed people are only really motivated to tell you what’s wrong with your plan to do good if they care about doing good in the first place. I think your claim is not that “all value-alignment is bad” but rather “when EAs talk about value-alignment, they’re talking about something much more specific and constraining than this tame interpretation”. I’d be interested in whether you agree.
- Another (potentially very specific and constraining) interpretation of “value alignment” that I understand people to be talking about when they’re hiring for EA roles is “I can give this person a lot of autonomy and they’ll still produce results that I think are good”. This recommends people who essentially have the same goals and methods as you, right down to the way they affect decisions about how to do your job. Hiring people like that means that you tax your management capacity comparatively less and don’t need to worry so much about incentive design. To the extent that this is a big focus in EA hiring, it could be because we have a deficit of management capacity and/or it’s difficult to effectively manage EA work. It certainly seems like EA research is often comparatively exploratory / preliminary and therefore underspecified, and so it’s very difficult to delegate work on it except to people who are already in a similar place to you on the matter.
- weeatquince 31 Jul 2020 17:30 UTC
  12 points
  > I think your claim is not that “all value-alignment is bad” but rather “when EAs talk about value-alignment, they’re talking about something much more specific and constraining than this tame interpretation”.
  To attempt an answer on behalf of the author: the author says “an increasingly narrow definition of value-alignment”, and I think the idea is that seeking “value-alignment” has got narrower and narrower over time and further from the goal of wanting to do good.
  In my time in EA, value alignment has, among some folk, gone from the tame meaning you provide (really wanting to figure out how to do good) to a narrower meaning such as: you also think human extinction is the most important thing.