Sure, here’s the ELI12:
Suppose that there are two billionaires, April and Autumn. Originally, both funded AMF because they thought work on AI alignment was 0.01% likely to succeed and that solving alignment would be as good as saving 10 billion lives. That is an expected value of 1 million lives, lower than what they could get by funding AMF.
After a while in the EA community, both switched to funding alignment research, but for different reasons.
April updated upwards on tractability. She thinks research on AI alignment is 10% likely to work, and solving alignment is as good as saving 10 billion lives.
Autumn now buys longtermist moral arguments. Autumn thinks research on AI alignment is 0.01% likely to work, and solving alignment is as good as saving 10 trillion lives.
Both of them assign the same expected utility to alignment: 1 billion lives. As such, they will make the same decisions. So even though April made an epistemic update and Autumn made a moral update, we cannot distinguish them from behavior alone.
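To check the arithmetic, here is a minimal sketch that multiplies each probability by the corresponding value. The function name `expected_lives` is made up for illustration, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

BILLION = 10**9
TRILLION = 10**12

def expected_lives(p_success, lives_if_solved):
    """Expected value = probability that the work succeeds * lives saved if it does."""
    return p_success * lives_if_solved

# Original shared view: 0.01% chance, worth 10 billion lives.
original = expected_lives(Fraction(1, 10_000), 10 * BILLION)
# April: epistemic update (10% chance), same values.
april = expected_lives(Fraction(1, 10), 10 * BILLION)
# Autumn: moral update (worth 10 trillion lives), same probability.
autumn = expected_lives(Fraction(1, 10_000), 10 * TRILLION)

print(original)         # 1000000
print(april)            # 1000000000
print(autumn)           # 1000000000
print(april == autumn)  # True
```

The two updates change different inputs, but the product, which is all that a decision depends on here, comes out identical.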
This extends to a general principle: actions are driven by a combination of your values and subjective probabilities, and any given action is consistent with many different combinations of utility function and probability distribution.
As a second example, suppose Bart is an investor who makes risk-averse decisions (say, he invests in bonds rather than stocks). He might do this for either of two reasons:
1. He would get a lot of disutility from losing money (maybe it’s his retirement fund).
2. He irrationally believes the probability of losing money is higher than it actually is (maybe he is biased because he grew up during a financial crash).
These different combinations of probability and utility produce the same risk-averse behavior. In fact, probability and utility are so interchangeable that professional traders, who are about as calibrated and rational as anyone about the probability of losing money and who are risk-averse only for reason (1), often model financial products as if losing money were more likely than it actually is, because it makes the math easier.
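Bart's two possible reasons can be sketched the same way. All the numbers below (a stock that gains 30% or loses 40%, a bond that returns 3%, a true loss probability of 20%, a loss weighted five times as heavily as a gain) are hypothetical, chosen only to make the point; none of them come from real markets.

```python
def expected_utility(outcomes, utility):
    """Sum of probability * utility over (probability, return) pairs."""
    return sum(p * utility(r) for p, r in outcomes)

def choose(p_loss, utility):
    """Pick the asset with the higher expected utility under the given beliefs."""
    stocks = [(1 - p_loss, 0.30), (p_loss, -0.40)]  # hypothetical payoffs
    bonds = [(1.0, 0.03)]                           # hypothetical safe return
    if expected_utility(stocks, utility) > expected_utility(bonds, utility):
        return "stocks"
    return "bonds"

def linear(r):
    """Utility equals the return itself: no special dislike of losses."""
    return r

def loss_averse(r):
    """Losses hurt five times as much as equal-sized gains (assumed factor)."""
    return r if r >= 0 else 5 * r

TRUE_P_LOSS = 0.2  # assumed "actual" probability of a loss

print(choose(TRUE_P_LOSS, linear))       # stocks: accurate beliefs, no loss aversion
print(choose(TRUE_P_LOSS, loss_averse))  # bonds: reason 1, high disutility from losses
print(choose(0.5, linear))               # bonds: reason 2, overestimated loss probability
```

An observer who only sees Bart buy bonds cannot tell the second case from the third.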
Thanks, this is helpful, and potentially a useful top-level post.