Setting aside what this post said, here’s an attitude I think we should be sympathetic to:
There are possible futures that are great by prosaic standards, where all humans are flourishing and so forth. But some of these futures may not be great by the standards that everyone would adopt if we were smarter, wiser, better-informed, and so forth (standards the author happens to believe would be utilitarian). Insofar as futures that are great by those wiser standards are much more choice-worthy in expectation than futures that are merely great by prosaic standards, we should have great concern not just for ensuring survival, but also for ensuring that good values are realized in the future. Achieving this may require certain events to happen, or to happen before others, or some specific coordination. Phrased more provocatively: superintelligence aligned with normal human values is a prima facie existential catastrophe, since normal human values probably aren’t really good, or aren’t what we would be promoting if we were wiser. I’m not sure the Schelling point note is relevant (it depends on which agents are coordinating on AI), but if it is, a better Schelling point may be some kind of extrapolation of human values.
Edit: ok, I agree we should be cautious about acting as though we are certain of utilitarianism, or whatever we happen to value, when those with whom we should cooperate disagree.
Yes, I agree with that. I think aiming for some sort of CEV-like system to find such values in the future, via some robustly-not-value-degrading process, seems like a good idea. Hopefully such a process could gain widespread assent. It’s the jumping straight to the (perceived) conclusion that I am objecting to.