What actually is “value-alignment”?
Preamble
This is an extract from a post called “Doing EA Better”, which argued that EA’s new-found power and influence obligate us to solve our movement’s significant problems with respect to epistemics, rigour, expertise, governance, and power.
We are splitting DEAB up into a sequence to facilitate object-level discussion.
Each post will include the relevant parts of the list of suggested reforms. There isn’t a perfect correspondence between the subheadings of the post and the reforms list, so not all reforms listed will be 100% relevant to the section in question.
Finally, we have tried (imperfectly) to be reasonably precise in our wording, and we ask that before criticising an argument of ours, commenters ensure that it is an argument that we are in fact making.
Main
Summary: The use of the term “value-alignment” in the EA community hides an implicit community orthodoxy. When people say “value-aligned” they typically do not mean a neutral “alignment of values”, nor even “agreement with the goal of doing the most good possible”, but a commitment to a particular package of views. This package, termed “EA orthodoxy”, includes effective altruism, longtermism, utilitarianism, Rationalist-derived epistemics, liberal-technocratic philanthropy, Whig historiography, the ITN framework, and the Techno-Utopian Approach to existential risk.
The term “value-alignment” gets thrown around a lot in EA, but is rarely actually defined. When asked, people typically say something about similarity or complementarity of values or worldviews, and this makes sense: “value-alignment” is of course a term defined in reference to what values the subject is (un)aligned with. You could just as easily speak of alignment with the values of a political party or a homeowner’s association.[6]
However, the term’s usage in EA spaces typically has an implicit component: value-alignment with a set of views shared and promoted by the most established and powerful components of the EA community. Thus:
Value-alignment = the degree to which one subscribes to EA orthodoxy
EA orthodoxy = the package of beliefs and sensibilities generally shared and promoted by EA’s core institutions (the CEA, FHI, OpenPhil, etc.)[7]
These include, but are not limited to:
Effective Altruism
i.e. trying to “do the most good possible”
Longtermism
i.e. believing that positively influencing the long-term future is a (or even the) key moral priority of our time
Utilitarianism, usually Total Utilitarianism
Rationalist-derived epistemics
Most notably subjective Bayesian “updating” of personal beliefs
Liberal-technocratic philanthropy
A broadly Whiggish/progressivist view of history
Best exemplified by Steven Pinker’s “Enlightenment Now”
Cause-prioritisation according to the ITN framework
The Techno-Utopian Approach to existential risk, which, in addition to several of the above, includes for instance:
Defining “existential risk” in reference to humanity’s “long-term potential” to generate immense amounts of (utilitarian) value by populating the cosmos with vast numbers of extremely technologically advanced beings
A methodological framework based on categorising individual “risks”[8], estimating for each a probability of causing an “existential catastrophe” within a given timeframe, and attempting to reduce the overall level of existential risk largely by working on particular “risks” in isolation (usually via technical or at least technocratic means); a toy sketch of this kind of calculation follows this list
Technological determinism, or at least a “military-economic adaptationism” that is often underpinned by an implicit commitment to neorealist international relations theory
A willingness to seriously consider extreme or otherwise exceptional actions to protect astronomically large amounts of perceived future value
There will naturally be exceptions here – institutions employ many people, whose views can change over time – but there are nonetheless clear regularities
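To make the methodological framework described above concrete, here is a minimal toy sketch of the kind of calculation it implies. The risk categories and probabilities below are invented for illustration, not taken from any EA source, and the independence assumption is itself a simplification.

```python
# Toy sketch of the "categorise risks, estimate probabilities, mitigate in
# isolation" framework described above. All figures are hypothetical.

def total_risk(per_risk_probabilities):
    """Overall chance of at least one existential catastrophe in the timeframe,
    assuming the individual risks are independent (a strong simplification)."""
    survival = 1.0
    for p in per_risk_probabilities.values():
        survival *= 1.0 - p
    return 1.0 - survival

# Hypothetical per-risk probabilities over some fixed timeframe
risks = {"risk_A": 0.10, "risk_B": 0.03, "risk_C": 0.005}

baseline = total_risk(risks)

# "Working on a particular risk in isolation": halve one probability
mitigated = dict(risks, risk_A=risks["risk_A"] / 2)

print(f"baseline total risk: {baseline:.3f}")
print(f"after mitigating A:  {total_risk(mitigated):.3f}")
```

The independence assumption is what makes it natural to tackle each “risk” separately; allowing risks to cascade or interact is precisely where this framing becomes contestable.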
Note that few, if any, of the components of orthodoxy are necessary aspects, conditions, or implications of the overall goal of “doing the most good possible”. It is possible to be an effective altruist without subscribing to all, or even any, of them, with the obvious exception of “effective altruism” itself.
However, when EAs say “value-aligned” they rarely seem to mean that one is simply “dedicated to doing the most good possible”, but that one subscribes to the particular philosophical, political, and methodological views packaged under the umbrella of orthodoxy.
Suggested reforms
Below, we have a preliminary, non-exhaustive list of relevant suggestions for structural and cultural reform that we think may be good ideas and should certainly be discussed further.
It is of course plausible that some of them would not work; if you think so for a particular reform, please explain why! We would like input from a range of people, and we certainly do not claim to have all the answers!
In fact, we believe it important to open up a conversation about plausible reforms not because we have all the answers, but precisely because we don’t.
Italics indicate reforms strongly inspired by or outright stolen from Zoe Cremer’s list of structural reform ideas. Some are edited or merely related to her ideas; they should not be taken to represent Zoe’s views.
Asterisks (*) indicate that we are less sure about a suggestion, but sure enough that we think it is worth considering seriously, e.g. through deliberation or research. Otherwise, we have been developing or advocating for most of these reforms for a long time and have a reasonable degree of confidence that they should be implemented in some form or another.
Timelines are suggested to ensure that reforms can become concrete. If stated, they are rough estimates, and if there are structural barriers to a particular reform being implemented within the timespan we suggest, let us know!
Categorisations are somewhat arbitrary; we just needed to break up the text for ease of reading.
Critique
Institutions
Funding bodies should enthusiastically fund deep critiques and other heterodox/“heretical” work
Red Teams
The judging panels of criticism contests should include people with a wide variety of views, including heterodox/“heretical” views
Epistemics
General
When EAs say “value-aligned”, we should be clear about what we mean
Aligned with what values in particular?
We should avoid conflating the possession of the general goal of “doing the most good” with subscription to the full package of orthodox views
EAs should consciously separate
An individual’s suitability for a particular project, job, or role
Their expertise and skill in the relevant area(s)
The degree to which they are perceived to be “highly intelligent”
Their perceived level of value-alignment with EA orthodoxy
Their seniority within the EA community
Their personal wealth and/or power
EAs should make a point of engaging with and listening to EAs from underrepresented disciplines and backgrounds, as well as those with heterodox/“heretical” views
Quantification
EAs should be wary of the potential for highly quantitative forms of reasoning to (comparatively easily) justify anything
We should be extremely cautious about e.g. high expected value estimates, very low probabilities being assigned to heterodox/“heretical” views, and ruin risks
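As a toy illustration of the concern above (with entirely made-up numbers): when a very small probability is multiplied by an astronomically large stake, the resulting expected value can swamp any well-evidenced intervention, so the conclusion rests on two inputs that are very hard to check.

```python
# Toy illustration (numbers entirely invented): a tiny probability times an
# astronomical stake can dominate any comparison.

p_success = 1e-10          # claimed chance the speculative project succeeds
value_if_success = 1e30    # claimed value at stake if it does
ev_speculative = p_success * value_if_success   # = 1e20

ev_ordinary = 0.9 * 1_000  # a well-evidenced intervention helping ~1,000 people

print(f"speculative EV: {ev_speculative:.2e}")  # 1.00e+20
print(f"ordinary EV:    {ev_ordinary:.2e}")     # 9.00e+02
# The speculative option "wins" by ~17 orders of magnitude, driven entirely
# by the two hard-to-verify inputs.
```

Nothing in the arithmetic is wrong; the worry is that conclusions of this form are only as reliable as the numbers fed into them.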
Diversity
People with heterodox/“heretical” views should be actively selected for when hiring to ensure that teams include people able to play “devil’s advocate” authentically, reducing the need to rely on highly orthodox people accurately steel-manning alternative points of view
Expertise & Rigour
Rigour
Work should be judged on its quality, rather than the perceived intelligence, seniority or value-alignment of its author
EAs should avoid assuming that research by EAs will be better than research by non-EAs by default
Reading
Insofar as a “canon” is created, it should be of the best-quality works on a given topic, not the best works by (orthodox) EAs about (orthodox) EA approaches to the topic
Reading lists, fellowship curricula, and bibliographies should be radically diversified
We should search everywhere for pertinent content, not just the EA Forum, LessWrong, and the websites of EA orgs
We should not be afraid of consulting outside experts, both to improve content/framing and to discover blind-spots
EAs should see fellowships as educational activities first and foremost, not just recruitment tools
Experts & Expertise
When hiring for research roles at medium to high levels, EA institutions should select in favour of domain-experts, even when that means passing over a highly “value-aligned” or prominent EA
Funding & Employment
Grantmaking
Grantmakers should be radically diversified to incorporate EAs with a much wider variety of views, including those with heterodox/“heretical” views
A certain proportion of EA funds should be allocated by lottery after a longlisting process to filter out the worst/bad-faith proposals* (see the illustrative sketch below)
The outcomes of this process should be evaluated in comparison to EA’s standard grantmaking methods as well as other alternatives
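To make the lottery proposal above more concrete, here is one minimal sketch of what a longlist-then-lottery allocation could look like. The scoring rule, threshold, and proposal data are hypothetical placeholders rather than a specification we are committed to.

```python
# Minimal sketch of a "longlist, then lottery" grant allocation.
# The quality scores, threshold, and proposals are hypothetical placeholders.
import random

proposals = [
    {"name": "proposal_A", "quality": 7, "bad_faith": False},
    {"name": "proposal_B", "quality": 2, "bad_faith": False},
    {"name": "proposal_C", "quality": 8, "bad_faith": True},
    {"name": "proposal_D", "quality": 6, "bad_faith": False},
]

# Longlisting: screen out bad-faith and clearly weak proposals only;
# no fine-grained ranking beyond the minimum bar.
MIN_QUALITY = 5
longlist = [p for p in proposals
            if p["quality"] >= MIN_QUALITY and not p["bad_faith"]]

# Lottery: fund a fixed number of the remaining proposals at random.
NUM_GRANTS = 1
funded = random.sample(longlist, k=min(NUM_GRANTS, len(longlist)))

print("longlist:", [p["name"] for p in longlist])
print("funded:  ", [p["name"] for p in funded])
```

The intent is that discretion is used only to enforce a minimum bar, while the final allocation among plausible proposals is left to chance; the evaluation suggested above would then compare this against EA’s standard grantmaking methods.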
Employment
More people working within EA should be employees, with the associated legal rights and stability of work, rather than e.g. grant-dependent “independent researchers”
EA funders should explore the possibility of funding more stable, safe, and permanent positions, such as professorships
Governance & Hierarchy
Decentralisation
EA institutions should see EA ideas as things to be co-created with the membership and the wider world, rather than transmitted and controlled from the top down
The community health team and grantmakers should offer community groups more autonomy, independence, and financial stability
Community-builders should not worry about their funding being cut if they disagree with the community health team or appear somewhat “non-value-aligned”
Contact Us
If you have any questions or suggestions about this article, EA, or anything else, feel free to email us at concernedEAs@proton.me
I disagree with the aptness of most of the meanings for “value alignment” you’ve proposed, and want to argue for the definition that I believe is correct. (I’m not disputing your claim that people often use “value-aligned” to mean “agrees with EA orthodoxy”, but I am claiming that those people are misusing the term.)
The true meaning of “value-aligned [with effective altruism]” is that someone:
places nonzero value on benefits to others (i.e. would be willing to pay some personal cost in order to make a benefit happen to someone else, even if they themself get absolutely none of the benefit)
believes that helping more is better than helping less
For example, you argue that those with heretical opinions or non-liberal-technocratic political views are flagged by EA orgs as “not value-aligned”. I think these people actually are value-aligned as long as they meet the above two criteria. I myself used to have extremely left-wing political views as a teenager, which I don’t think would disqualify someone from that status (and I would say I was already value-aligned back then). Even a socialist who spends their days sabotaging foreign aid shipments is value-aligned with us, if they’re doing this out of the belief that it’s the most effective way to help, but would switch to donating to GiveWell and working on AI alignment if they changed their beliefs on factual propositions about the scale and tractability of various problems.
I would be tempted to add something about being truth-seeking as well. So, is someone interested in updating their beliefs about what is more effective, or is this the last thing that they would want?
I think truth-seeking to the extent that you’re constantly trying to falsify your existing beliefs and find out if they should be different is too high a bar for this, but the first two conditions entail some lesser degree of truth-seeking. Like if you’re an eco-terrorist who bombs nuclear plants, but unbeknownst to you, coal plants are worse for the environment than nuclear plants, and someone informs you of that, you’d at a minimum switch to bombing coal plants, rather than ignoring the new information and continuing with your existing intervention. Seeking better opportunities and questioning your current plans is admirable and a positive thing many EAs do, but I don’t think it’s part of the minimum requirement for value alignment. I can think of a certain field where a lot of EAs work who don’t meet such a standard.
Even this is an addition to your claimed values—the concept of tractability as understood in EA is only a proxy for impact, and often a bad one, not fitting cases where effect is non-linear. For example, if you’re a communist who believes the world would be much better after the revolution, obviously you’re going to have zero effect for most of the time, but then a large one if and when the revolution comes.
This exactly exemplifies the way that unconnected ideas creep in when we talk about being EA-aligned, even when we think it only means those two central values.
I’ll post a more thorough engagement with this post later, and thanks again for breaking it into chunks. But I have a repeated question for the authors—can you please define what a “deep critique” is? How does it differ at all from a normal critique?[1]
I think you’ve used that term in all the Doing EA Better posts, along with other comments. But I couldn’t find where the term was defined at all, and the examples given don’t actually point me towards an understanding. After a bit of (admittedly very brief) Google-Fu, the best reference I could find was the title of Chapter 7 in this book—but on the two pages you can view there for free, the term also isn’t defined!
If this is just a linguistic/definitional term then fine, but I think you’re trying to point out something more. I’d definitely appreciate the authors helping me clear up my understanding a bit more so that I can engage with DEAB more productively :)
My vague impression is that it’s a critique that contradicts the fundamentals of a belief-system. If so, fine, but in that case any organisation accepting a deep critique would end up undermining itself. It would be like the Catholic Church accepting “Jesus was not divine and he is not the son of God, plus we’re not that confident in this God business anyway...”, or is that me being uncharitable?
Hi JWS,
The term is explored in an upcoming section, here.
I appreciate you splitting this into a series, it makes this meaty set of critiques much easier to digest!
All of the orthodoxies are phrased as positives, but I think this one cuts more at the joints if you make it negative.
I think of this one as “rejection of standpoint epistemology”, or perhaps even better: “standpoint epistemology is the exception, not the rule”
I’ve talked to some philosophers about this, twice I’ve been told that SE isn’t a proper tradition recognized by epistemologists, that it’s more of an ideological cudgel that people use to signal allegiance but doesn’t have a free-standing account of what it “really is” that people agree on. But in the world you encounter people with intuitions about it, and these intuitions end up informing them a lot, so I think it’s useful to talk about even if it can’t be done at the standards of academic philosophy.
effectivealtruism.org suggests that EA values include:
proper prioritization: appreciating scale of impact, and trying for larger scale impact (for example, helping more people)
impartial altruism: giving everyone’s interests equal weight
open truth-seeking: including willingness to make radical changes based on new evidence
collaborative spirit: involving honesty, integrity, and compassion, and paying attention to means, not just ends.
Cargill Corporation lists its values as:
Do the Right Thing
Put People First
Reach Higher
Lockheed-Martin Corporation lists its values as:
Do What’s Right
Respect Others
Perform with Excellence
Shell Global Corporation lists its values as:
Integrity
Honesty
Respect
Short lists seem to be a trend, but longer lists with a different label than “values” appear from other corporations (for example, from Google or General Motors). They all share the quality of being aspirational, but there’s a difference with the longer lists: they seem more closely suited to the specifics of what the corporations do.
Consider Google’s values:
Focus on the user and all else will follow.
It’s best to do one thing really, really well.
Fast is better than slow.
Democracy on the web works.
You don’t need to be at your desk to need an answer.
You can make money without doing evil.
There’s always more information out there.
The need for information crosses all borders.
You can be serious without a suit.
Great just isn’t good enough.
Google’s values are specific. They do more than build their brand.
I would like to suggest that EA’s values should be lengthy and specific enough to:
identify your unique attributes.
focus your behavior.
reveal your preferred limitations[1].
Having explicit values of that sort:
limit your appeal.
support your integrity.
encourage your honesty.
Values like these focus and narrow your work, in addition to building your brand. Shell Global, Lockheed-Martin and Cargill are just building their brand. The Google Philosophy says more and speaks to their core business model.
All the values listed as part of Effective Altruism appear to overlap with the concerns that you raise. Obviously, you get into specifics.
You offer specific reforms in some areas. For example:
“A certain proportion of EA funds should be allocated by lottery after a longlisting process to filter out the worst/bad-faith proposals*”
“More people working within EA should be employees, with the associated legal rights and stability of work, rather than e.g. grant-dependent ‘independent researchers’.”
These do not appear obviously appropriate to me. I would want to find out what a longlisting process is, and why employees are a better approach than grant-dependent researchers. A little explanation would be helpful.
However, other reforms do read more like statements of value or truisms to me. For example:
“Work should be judged on its quality...” [rather than its source].
“EAs should be wary of the potential for highly quantitative forms of reasoning to (comparatively easily) justify anything”
It’s a truism that statistics can justify anything, as in the Mark Twain saying, “There are three kinds of lies: lies, damned lies, and statistics”.
These reforms might inspire values like:
Judge work on its quality alone, not its source
Use quantitative reasoning only when appropriate
You folks put a lot of work into writing this up for EAs. You’re smart and well-informed, and I think you’re right where you make specific claims or assert specific values. All I am thinking about here is how to clarify the idea of aligning with values, the values you have, and how to pursue them.
You wrote that you started with a list of core principles before writing up your original long post? I would like to see that list, if it’s not too late and you still have it. If you don’t want to offer it now, maybe later, as a refinement of what you offered here?
Something like the Google Philosophy, short and to the point, will make it clear that you’re being more than reactive to problems, but instead actually have either:
differences in values from orthodox EAs
differences in what you perceive as achievement of EA values by orthodox EAs
Here are a few prompts to help define your version of EA values:
EAs emphasize quantitative approaches to charity, as part of maximizing their impact cost-effectively. Quantitative approaches have pros and cons, so how should they be contextualized? They don’t work in all cases, but that’s not a bad thing. Maybe EA should only pay attention to contexts where quantitative approaches do work well. Maybe that limits EA flexibility and scope of operations, but it also keeps EA integrity, accords with EA beliefs, and focuses EA efforts. You have specific suggestions about IBT and what makes a claim of probabilistic knowledge feasible. Those can be incorporated into a value statement. Will you help EA focus and limit its scope, or are you aiming to improve EA flexibility because that’s necessary in every context where EA operates?
EAs emphasize existential risk causes. ConcernedEAs offer specific suggestions to improve EA research into existential risk. How would you inform EA values about research in general to include what you understand should be the EA approach to existential risk research? You raise concerns about the evaluation of cascading and systemic risks. How would those specific concerns inform your values?
You have specific concerns about funding arrangements, nepotism, and revolving doors between organizations. How would those concerns inform your values about research quality or charity impact?
You have concerns about lack of diversity and its impact on group epistemics. What should be values there?
You can see the difference between brand-building:
ethicality
impactfulness
truth-seeking
and getting specific:
research quality
existential, cascading, and systemic risks
scalable and impactful charity
quantitative and qualitative reasoning
multi-dimensional diversity
epistemic capability
democratized decision-making
That second list is more specific, plausibly hits the wrong notes for some people, and definitely demonstrates particular preferences and beliefs. As it should! Whatever your list looks like, would alignment with its values imply the ideal EA community for you? That’s something you could take another look at, articulating the values behind specific reforms if those are not yet stated or incorporating specific reforms into the details of a value, like:
democratized decision-making: incorporating decision-making at multiple levels within the EA community, through employee polling, yearly community meetings, and engaging charity recipients.
I don’t know whether you like the specific value descriptors I chose there. Perhaps I misinterpreted your values somewhat. You can make your own list. Making decisions in alignment with values is the point of having values. If you don’t like the decisions, the values, or if the decisions don’t reflect the values, the right course is to suggest alterations somewhere, but in the end, you still have a list of values, principles, or a philosophy that you want EA to follow.
[1] As I wrote in a few places in this post, and taking a cue from Google and the Linux philosophy, sometimes doing one thing and doing it well is preferable to offering loads of flexibility. If EA is supposed to be the Swiss Army knife of making change in the world, there are still plenty of organizations out there that are better for some purposes than others; as any user of a Swiss Army knife will attest, they are not ideal for all tasks. Also, your beliefs will inform you about what you do well. Does charity without quantitative metrics inevitably result in waste and corruption? Does use of quantitative metrics limit the applicability of EA efforts to specific types of charity work (for example, outreach campaigns)? Do EA quantitative tools limit the value of its work in existential risk? Can they be expanded with better quantitative tools (or qualitative ones)? Maybe EA is self-limiting because of its preferred worldview, beliefs and tools. Therefore, it has preferred limitations. Which is OK, even good.