In our work at QURI, this is important because we want to programmatically encode utility functions and use them directly for decision-making.
I find this ambition a little concerning, but it could be that I’m reading it wrong. In my mind, the most dangerous possible design for an AI (or an organisation) would be a fixed-goal utility-function maximiser. For explanations of why, see this post, this post, and this post. Such an entity could pursue its “number go up” until the destruction of the universe. Current AIs do not fit this design, so why would you seek to change that?
Am I misunderstanding this completely?
Good question! As I tried to describe in this post (though I really didn’t get into the details of specific QURI plans), there are many sorts of utility functions you can use, and many ways of optimizing over them.
Using some of my terminology above, I think a lot of people here think of advanced AIs as applying a highly prescriptive, deliberation-extrapolated utility function with a great deal of optimization power, particularly in situations where there’s very little ability to account for utility-function uncertainty. I agree that this is scary and a bad idea, especially where we have little experience optimizing for any sort of explicit utility function.
But again, utility functions and their optimization can be far more harmless than this.
Very simple examples of (partial) utility functions include:
- For each animal/being, how much should EA donors value one of their life-years?
- For each of [set of EA projects], how valuable did it seem to be?
- For each of [a long list of potential life hacks], what is the expected value?
I believe these lists can satisfy the core tenets of Von Neumann–Morgenstern utility functions, but I don’t think that many people here would consider “trying to make these lists using reasonable measures, and generally then taking decisions based on them” to be particularly scary or controversial.
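To make this concrete, here is a minimal sketch (in Python, with invented option names and numbers, not QURI’s actual tooling) of what “making such a list and taking decisions based on it” might look like: a partial utility function is just a table of relative values over a fixed set of options, and the decision procedure built on top of it can be as tame as picking the top-ranked item.

```python
# A toy "partial utility function": relative values over a handful of options,
# on an arbitrary common scale. The option names and numbers are made up
# purely for illustration.
relative_value = {
    "project_a": 12.0,
    "project_b": 3.5,
    "project_c": 7.2,
}

def best_option(values: dict[str, float]) -> str:
    """Return the option with the highest assigned value.

    This is the only 'optimization' happening: a lookup and a comparison,
    not an open-ended maximizer acting on the world.
    """
    return max(values, key=values.get)

print(best_option(relative_value))  # -> "project_a" with these toy numbers
```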
In the limit, I could imagine people saying, “I think that a prescriptive, deliberation-extrapolated utility function with a great deal of optimization power is scary, so we should never optimize or improve anything that’s technically any sort of utility function. Therefore, no more cost-benefit analysis, no more rankings of things, etc.” I think this would be a highly unfortunate mistake, and I attempted to clarify some aspects of it in this post.
Reading more closely, I would separately note that there seems to be some semantic ambiguity in how you and others describe extreme optimizers.
I think that an agent that’s “intensely maximizing for a goal that can be put into numbers in order to show that it’s optimal” can still be incredibly humble and reserved.
Holden writes, “Can we avoid these pitfalls by ‘just maximizing correctly’?” and basically answers no, but his alternative proposal is to “apply a broad sense of pluralism and moderation to much of what they do”.
I think that very arguably, Holden is basically saying, “The utility we’d get from executing the [pluralism and moderation] strategy is greater than the utility we’d get from executing the [naive narrow optimization] strategy, so we should pursue the former.” To me, this can easily be understood as a form of “utility optimization over utility optimization strategies.” So Holden’s resulting strategy can still be considered utility optimization, in my opinion.
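As a toy illustration of what I mean by “utility optimization over utility optimization strategies” (again in Python, with invented strategy names and numbers, not anything Holden actually wrote):

```python
# Toy sketch: compare the utility we expect each decision *strategy* to
# deliver, and pick the higher-scoring one. Strategy names and numbers are
# invented for illustration only.
expected_utility_of_strategy = {
    "naive_narrow_optimization": 40.0,
    "pluralism_and_moderation": 55.0,
}

# Choosing "pluralism and moderation" because it scores higher is itself a
# utility comparison, i.e. optimization one level up.
chosen = max(expected_utility_of_strategy, key=expected_utility_of_strategy.get)
print(chosen)  # -> "pluralism_and_moderation"
```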