I'm in agreement with most of your post, except for one thing: calling these "changes to our values."
The following is the beginning of a chain of thinking that isn't fully fleshed out, but I hope it's useful. All word choices are probably suboptimal. I don't hold the implications of these views very strongly (or at all); I'm mostly trying to puzzle things out and provide arguments I think are somewhat strong or compelling.
All the things you mention don't seem like values to me; they seem more like strategies or approaches to doing good (which are downstream of more fundamental values).
"Core" values are things like truth-seeking, epistemic humility, or maximizing impact, whereas, for example, "cause neutrality" and by extension "longtermism" are downstream of those.
But we also have "secondary values" (terrible wording), which are influenced by our core values, our worldview, and specific (cognitive) beliefs about how the world works (these influence each other but are somewhat independent).
I can see a version of EA where the "core values → longtermism" chain gets replaced with just longtermism as a default, much like current EA treats the "core values → helping people in developing countries" chain as something of a default. I don't think people very often come into EA strongly opposing that value set. This isn't a bad thing; these are the low-hanging fruit.
Why are core & secondary values important to distinguish?
People who are on board with the changes do not see the shared values as conflicting with the core values; they see them as a natural progression of the core values. Just as we thought that "everyone matters" leads to "donate to help improve the lives of poor people in developing countries," so too does "everyone matters" lead to "future people should be our priority."
Implication: people reading this post may say "this isn't value drift."
I think our core values are really important and are the real glue of our community, a glue that will withstand the test of time and ideally let us adapt, change, and grow as we get new information and update our beliefs.
Maybe this is too idealistic, and in practice saying "but we share the same core values," even if true, is simply not enough.
In practice, the level of secondary values can be more useful: maybe technical AI safety researchers and farmed animal welfare advocates just don't have that much in common, or the inferential distance is a bit too large in terms of their models of the world, impact, risk aversion, etc.
Maybe related: even for ideal expected utility maximizers, values and subjective probabilities are impossible to disentangle by observing behavior alone. So it's not always easy to tell which changes are value drift vs. epistemic updates.
While I understand the point you're making, the comment you linked is (to my non-STEM mind) pretty hard to parse. Would you be able to give a less technical, more ELI5 explanation?
Sure, here's the ELI12:
Suppose there are two billionaires, April and Autumn. Originally they were both funding AMF, because they thought working on AI alignment was 0.01% likely to work and that solving alignment would be as good as saving 10 billion lives: an expected value of 1 million lives, lower than what you could get by funding AMF.
After being in the EA community for a while, they both switched to funding alignment research, but for different reasons.
April updated upwards on tractability. She now thinks research on AI alignment is 10% likely to work, and still thinks solving alignment is as good as saving 10 billion lives.
Autumn now buys longtermist moral arguments. She still thinks research on AI alignment is 0.01% likely to work, but now thinks solving alignment is as good as saving 10 trillion lives.
Both of them assign the same expected value to alignment: 1 billion lives. As such, they will make the same decisions. So even though April made an epistemic update and Autumn a moral update, we cannot distinguish them from behavior alone.
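To make the arithmetic concrete, here is a minimal sketch in Python using only the numbers from the example above (the helper function name is mine, purely for illustration):

```python
# Minimal sketch: April's epistemic update and Autumn's moral update
# multiply out to the same expected value, so their behavior is identical.

def expected_lives_saved(p_success, lives_if_solved):
    """Expected value of funding alignment, measured in lives saved."""
    return p_success * lives_if_solved

original = expected_lives_saved(0.0001, 10e9)   # 0.01% of 10 billion  = 1 million
april    = expected_lives_saved(0.10,   10e9)   # 10% of 10 billion    = 1 billion
autumn   = expected_lives_saved(0.0001, 10e12)  # 0.01% of 10 trillion = 1 billion

print(original)         # 1000000.0
print(april, autumn)    # 1000000000.0 1000000000.0
print(april == autumn)  # True: same decision, different reasons
```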
This extends to a general principle: actions are driven by a combination of your values and subjective probabilities, and any given action is consistent with many different combinations of utility function and probability distribution.
As a second example, suppose Bart is an investor who makes risk-averse decisions (say, invests in bonds rather than stocks). He might do this for two reasons:
He would get a lot of disutility from losing money (maybe it's his retirement fund).
He irrationally believes the probability of losing money is higher than it actually is (maybe he is biased because he grew up during a financial crash).
These different combinations of probability and utility produce the same risk-averse behavior. In fact, probability and utility are so interchangeable that professional traders (just about the most calibrated, rational people with regard to the probability of losing money, and who are only risk-averse for reason 1) often model financial products as if losing money were more likely than it actually is, because it makes the math easier.
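As a rough sketch of Bart's two stories, assume some purely made-up numbers for his wealth and the two investments: a concave (loss-averse) utility with accurate probabilities and a risk-neutral utility with an exaggerated loss probability both end up preferring bonds, even though a risk-neutral investor with accurate beliefs would pick stocks.

```python
import math

# Purely illustrative numbers: current wealth plus two simple win/lose gambles.
WEALTH = 100_000
STOCKS = {"p_loss": 0.5, "loss": 50_000, "gain": 60_000}
BONDS  = {"p_loss": 0.05, "loss": 5_000, "gain": 4_000}

def expected_utility(p_loss, loss, gain, utility):
    """Expected utility of ending wealth under a simple win/lose gamble."""
    return p_loss * utility(WEALTH - loss) + (1 - p_loss) * utility(WEALTH + gain)

def linear(w):       # risk-neutral: utility proportional to wealth
    return w

def log_utility(w):  # concave: losses hurt more than equal gains help (reason 1)
    return math.log(w)

# Reason 1: accurate probabilities, but strong disutility from losses.
prefers_bonds_1 = (expected_utility(utility=log_utility, **BONDS)
                   > expected_utility(utility=log_utility, **STOCKS))

# Reason 2: risk-neutral utility, but an exaggerated belief about stock losses.
pessimistic_stocks = dict(STOCKS, p_loss=0.7)
prefers_bonds_2 = (expected_utility(utility=linear, **BONDS)
                   > expected_utility(utility=linear, **pessimistic_stocks))

# With accurate beliefs a risk-neutral investor would actually pick stocks,
# so each of the two explanations above is genuinely doing the work.
prefers_stocks_if_neutral = (expected_utility(utility=linear, **STOCKS)
                             > expected_utility(utility=linear, **BONDS))

print(prefers_bonds_1, prefers_bonds_2, prefers_stocks_if_neutral)  # True True True
```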
Thanks, this is helpful, and potentially a useful top-level post.
Very valid! I guess I'm thinking of this as "approaches EA values" [verb] rather than "values" [noun]. I think most if not all of the most abstract values EA holds are still in place, but the distinction between core and secondary values is important.
This was mainly a linguistic comment, because I find that people sometimes disagree with a post if the terminology used is wrong, so I wanted to get ahead of that. I probably could have been clearer that I think you've identified something important and true here; I am somewhat concerned about how memes spread, and I wouldn't want people who haven't updated along those lines to feel less like they are part of the EA community.
I agree with this goal hierarchy framework; it's super useful to appreciate that many of one's personal goals are just extrapolations and mental shortcuts of more distilled upstream goals.