I'm in agreement with most of your post, except for one thing: calling these "changes to our values."
The following is the beginning of a chain of thinking that isn't fully fleshed out, but I hope it's useful. All word choices are probably suboptimal. I don't hold the implications of these views very strongly (or at all); I'm mostly trying to puzzle things out and provide arguments I think are somewhat strong or compelling.
All the things you mention don't seem like values to me; they seem more like strategies or approaches to doing good (which are downstream of more fundamental values).
"Core" values are things like truth-seeking, epistemic humility, or maximizing impact, whereas, for example, "cause neutrality" and by extension "longtermism" are downstream of those.
But we also have "secondary values" (terrible wording), which are influenced by our core values, our worldview, and specific (cognitive) beliefs about how the world works (these influence each other but are somewhat independent).
I can see a version of EA where the "core values → longtermism" chain gets replaced with just longtermism as a default, much like current EA treats the "core values → helping people in developing countries" chain as something of a default. I don't think people very often come into EA strongly opposing that value set. This isn't a bad thing; these are the low-hanging fruit.
Why are core & secondary values important to distinguish?
People who are on board with the changes do not see the shared values as conflicting with the core values; they see them as a natural progression of the core values. Just as we thought that "everyone matters" leads to "donate to help improve the lives of poor people in developing countries," so too does "everyone matters" lead to "future people should be our priority."
Implication: people reading this post may say "this isn't value drift."
I think our core values are really important and are the real glue of our community, a glue that will withstand the test of time and ideally let us adapt, change, and grow as we get new information and update our beliefs.
Maybe this is too idealistic, and in practice saying "but we share the same core values," even if true, is simply not enough.
In practice, the level of secondary values can be more useful: maybe technical AI safety researchers and farmed animal welfare advocates just don't have that much in common, or the inferential distance is a bit too large in terms of their models of the world, impact, risk aversion, etc.
Maybe related: even for ideal expected utility maximizers, values and subjective probabilities are impossible to disentangle by observing behavior alone. So it's not always easy to tell which changes are value drift vs. epistemic updates.
While I understand the point you're making, the comment you linked is (to my non-STEM mind) pretty hard to parse. Would you be able to give a less technical, more ELI5 explanation?
Sure, here's the ELI12:
Suppose there are two billionaires, April and Autumn. Originally they were both funding AMF, because they thought working on AI alignment was 0.01% likely to work and that solving alignment would be as good as saving 10 billion lives: an expected value of 1 million lives, lower than what you could get by funding AMF.
After being in the EA community for a while, they both switched to funding alignment research, but for different reasons.
April updated upwards on tractability. She now thinks research on AI alignment is 10% likely to work, and still thinks solving alignment is as good as saving 10 billion lives.
Autumn now buys longtermist moral arguments. She still thinks research on AI alignment is 0.01% likely to work, but now thinks solving alignment is as good as saving 10 trillion lives.
Both of them assign the same expected value to alignment: 1 billion lives. As such, they will make the same decisions. So even though April made an epistemic update and Autumn a moral update, we cannot distinguish them from behavior alone.
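To make the arithmetic concrete, here is a minimal sketch in Python using only the numbers from the example above (the helper function name is mine, purely for illustration):

```python
# Minimal sketch: April's epistemic update and Autumn's moral update
# multiply out to the same expected value, so their behavior is identical.

def expected_lives_saved(p_success, lives_if_solved):
    """Expected value of funding alignment, measured in lives saved."""
    return p_success * lives_if_solved

original = expected_lives_saved(0.0001, 10e9)   # 0.01% of 10 billion  = 1 million
april    = expected_lives_saved(0.10,   10e9)   # 10% of 10 billion    = 1 billion
autumn   = expected_lives_saved(0.0001, 10e12)  # 0.01% of 10 trillion = 1 billion

print(original)         # 1000000.0
print(april, autumn)    # 1000000000.0 1000000000.0
print(april == autumn)  # True: same decision, different reasons
```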
This extends to a general principle: actions are driven by a combination of your values and subjective probabilities, and any given action is consistent with many different combinations of utility function and probability distribution.
As a second example, suppose Bart is an investor who makes risk-averse decisions (say, invests in bonds rather than stocks). He might do this for two reasons:
He would get a lot of disutility from losing money (maybe it's his retirement fund).
He irrationally believes the probability of losing money is higher than it actually is (maybe he is biased because he grew up during a financial crash).
These different combinations of probability and utility produce the same risk-averse behavior. In fact, probability and utility are so interchangeable that professional traders (just about the most calibrated, rational people with regard to the probability of losing money, and who are only risk-averse for reason 1) often model financial products as if losing money were more likely than it actually is, because it makes the math easier.
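As a rough sketch of Bart's two stories, assume some purely made-up numbers for his wealth and the two investments: a concave (loss-averse) utility with accurate probabilities and a risk-neutral utility with an exaggerated loss probability both end up preferring bonds, even though a risk-neutral investor with accurate beliefs would pick stocks.

```python
import math

# Purely illustrative numbers: current wealth plus two simple win/lose gambles.
WEALTH = 100_000
STOCKS = {"p_loss": 0.5, "loss": 50_000, "gain": 60_000}
BONDS  = {"p_loss": 0.05, "loss": 5_000, "gain": 4_000}

def expected_utility(p_loss, loss, gain, utility):
    """Expected utility of ending wealth under a simple win/lose gamble."""
    return p_loss * utility(WEALTH - loss) + (1 - p_loss) * utility(WEALTH + gain)

def linear(w):       # risk-neutral: utility proportional to wealth
    return w

def log_utility(w):  # concave: losses hurt more than equal gains help (reason 1)
    return math.log(w)

# Reason 1: accurate probabilities, but strong disutility from losses.
prefers_bonds_1 = (expected_utility(utility=log_utility, **BONDS)
                   > expected_utility(utility=log_utility, **STOCKS))

# Reason 2: risk-neutral utility, but an exaggerated belief about stock losses.
pessimistic_stocks = dict(STOCKS, p_loss=0.7)
prefers_bonds_2 = (expected_utility(utility=linear, **BONDS)
                   > expected_utility(utility=linear, **pessimistic_stocks))

# With accurate beliefs a risk-neutral investor would actually pick stocks,
# so each of the two explanations above is genuinely doing the work.
prefers_stocks_if_neutral = (expected_utility(utility=linear, **STOCKS)
                             > expected_utility(utility=linear, **BONDS))

print(prefers_bonds_1, prefers_bonds_2, prefers_stocks_if_neutral)  # True True True
```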
Thanks, this is helpful, and potentially a useful top-level post.
Very valid! I guess I'm thinking of this as "approaches EA values" [verb] rather than "values" [noun]. I think most if not all of the most abstract values EA holds are still in place, but the distinction between core and secondary values is important.
This was mainly a linguistic comment, because I find that people sometimes disagree with a post if the terminology used is wrong, so I wanted to get ahead of that. I probably could have been clearer that I think you've identified something important and true here; I am somewhat concerned about how memes spread, and I wouldn't want people who haven't updated along those lines to feel less like they are part of the EA community.
I agree with this goal hierarchy framework; it's super useful to appreciate that many of one's personal goals are just extrapolations and mental shortcuts of more distilled upstream goals.