Related, but not so much the aim of your post: who or what is actually going to money pump or Dutch book a superintelligence, even if it doesn't maximize expected utility? Many money pumps and Dutch books seem pretty contrived and unlikely to occur naturally without adversaries. So where would the pressure to avoid them actually come from? Maybe financial markets, but would the AI need to generalize its aversion to exploitation in markets to all of its preferences? Negotiations with humans to gain power before it kills us all? Again, would it need to generalize from these?
I guess cyclic preferences could be bad in natural/non-adversarial situations.
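To make the cyclic case concrete, here is a minimal sketch (my own toy example, not from the post) of how an agent with strictly cyclic preferences A > B > C > A can be milked by anyone willing to keep offering trades. The item names and the `prefers`/`money_pump` functions are purely illustrative:

```python
# Toy illustration (assumed setup, not from the post): an agent with cyclic
# preferences A > B, B > C, C > A, facing a counterparty who charges a small
# fee for each trade the agent already prefers to make.

PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}  # hypothetical strict preferences

def prefers(x, y):
    """True if the agent strictly prefers holding x over y."""
    return (x, y) in PREFERS

def money_pump(start="A", fee=1, rounds=6):
    """Repeatedly offer the agent an item it strictly prefers to what it holds,
    charging `fee` per trade. With cyclic preferences the agent accepts every
    offer, walks in a circle, and pays indefinitely."""
    holding, paid = start, 0
    for _ in range(rounds):
        offer = next(x for x in "ABC" if prefers(x, holding))  # always exists in a cycle
        holding, paid = offer, paid + fee  # agent accepts and pays the fee
    return holding, paid

if __name__ == "__main__":
    holding, paid = money_pump()
    # After 6 trades the agent holds its original item but is 6 units poorer.
    print(f"holding {holding}, total paid {paid}")
```

Even here, though, the loss only materializes if some counterparty actually shows up and runs the loop, which is exactly the question of where the pressure comes from.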
There are also opposite pressures: if you would otherwise have had exploitable preferences, making them non-exploitable means giving something else up, namely some of your preference rankings. That is also a cost, and an AGI may not be willing to pay it.
Seconded! On this note, I think the assumed presence of adversaries or competitors is actually one of the under-appreciated upshots of MIRI’s work on Logical Induction (https://intelligence.org/2016/09/12/new-paper-logical-induction/). By the logical induction criterion they propose, “good reasoning” is only defined with respect to a market of traders of a particular complexity class—which can be interpreted as saying that “good reasoning” is really intersubjective rather than objective! There’s only pressure to find the right logical beliefs in a reasonable amount of time if there are others who would fleece you for not doing so.
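For readers who haven't seen the paper, the criterion is roughly the following (my paraphrase from memory, with the deductive process and most technical details suppressed; the notation V_n and EC is mine, not the paper's):

\[
  \overline{\mathbb{P}} \text{ satisfies the logical induction criterion}
  \iff
  \neg\,\exists\, T \in \mathrm{EC}\;
  \Big(
    \inf_{n} V_n\big(T, \overline{\mathbb{P}}\big) > -\infty
    \;\wedge\;
    \sup_{n} V_n\big(T, \overline{\mathbb{P}}\big) = +\infty
  \Big)
\]

where EC is the class of efficiently computable traders and V_n is the value of the trader's accumulated holdings at stage n. In words: no efficiently computable trader can make unbounded gains off the market's prices while risking only a bounded loss, which is exactly the "others who would fleece you" framing above.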
> “good reasoning” is really intersubjective rather than objective! There’s only pressure to find the right logical beliefs in a reasonable amount of time if there are others who would fleece you for not doing so.
This is a really interesting point that reminds me of arguments made by pragmatist philosophers like John Dewey and Richard Rorty. They also wanted to make “justification” an intersubjective phenomenon, of justifying your beliefs to other people. I don’t think they had money-pump arguments in mind though.
That’s why the standard prediction is not that AIs will be perfectly coherent, but that it makes sense to model them as being sufficiently coherent in practice, in the sense that e.g. we can’t rely on incoherence in order to shut them down.
I guess there are also acausal-influence cases and locally multipolar (multiple competing AGIs) cases.