I think this response misses the forest for the trees here. It’s true that you can fit some utility function to any behaviour if you move to a more fine-grained outcome-space on which the preferences come out coherent. But doing so removes basically all of the predictive content that Eliezer et al. assume when invoking the coherence theorems.
In particular, the use of these theorems in doomer arguments absolutely does implicitly care about “internal structure” stuff—e.g. one major premise is that non-EU-maximising AIs will reflectively iron out the “wrinkles” in their preferences to better approximate an EU-maximiser, since they will notice that (e.g.) their incompleteness leads to exploitability. The OP’s argument shows that an agent with incomplete preferences will be inexploitable by its own lights. The fact that there’s some completely different way to refactor the outcome-space such that, from the outside, it looks like an EU-maximiser is just irrelevant.
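To make the inexploitability point concrete, here is a toy sketch (my own illustration, not the OP’s formalism; all outcome names and the preference relation are made up). A myopic agent with incomplete preferences accepts any offer it doesn’t strictly disprefer and gets money-pumped; an agent that also checks offers against its starting point never ends up worse off by its own lights:

```python
# Toy model: incomplete preferences and money pumps.
# Strict preferences: A > A-, B > B-; A vs B and the cross pairs are incomparable.
strictly_prefers = {("A", "A-"), ("B", "B-")}

def prefers(x, y):
    """True iff x is strictly preferred to y."""
    return (x, y) in strictly_prefers

def myopic_trades(start, offers):
    """Accept any offer not strictly dispreferred to the current holding."""
    current = start
    for offer in offers:
        if not prefers(current, offer):
            current = offer
    return current

def forward_looking_trades(start, offers):
    """Additionally refuse any offer strictly dispreferred to the STARTING
    holding, so no sequence of trades can end in a sure loss."""
    current = start
    for offer in offers:
        if not prefers(current, offer) and not prefers(start, offer):
            current = offer
    return current

# Classic pump: A -> B- (incomparable, accepted) -> A- (incomparable with B-, accepted).
myopic_end = myopic_trades("A", ["B-", "A-"])            # ends at A-: pumped
fl_end = forward_looking_trades("A", ["B-", "A-"])       # ends at B-
print(myopic_end, fl_end)
```

The myopic agent ends at A-, which it strictly disprefers to where it started; the forward-looking agent ends at B-, which is merely incomparable with A, so by its own lights it has not been exploited.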
>If describing a system as a decision theoretic agent is that cumbersome, it’s probably better to look for some other model to predict its behaviour
This also seems to be begging the question. If I have something I think I can describe as a non-EU-maximising decision-theoretic agent, but which can only be described via an incredibly cumbersome utility function, why not conclude that EU-maximisation is the wrong way to model the agent, rather than throwing out the belief that it should be modelled as an agent at all? If I have a preferential gap between A and B, and you have to jump through some ridiculous hoops to make this look EU-coherent ( “he prefers [A and Tuesday and feeling slightly hungry and saw some friends yesterday and the price of blueberries is <£1 and....] to [B and Wednesday and full and at a party and blueberries >£1 and...]” ), the correct conclusion seems to be to abandon modelling me as an EU-maximiser, not to abandon modelling me as a decision-theoretic agent.
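The triviality of the fine-graining move is easy to see in a toy model (again my own illustration; the outcomes and context tags are made up). Even outright cyclic choices, which no utility function over the coarse outcomes can rationalise, become “EU-coherent” the moment you index each outcome by the context of choice:

```python
# Toy model: fine-graining the outcome-space trivialises utility representation.
from itertools import permutations

# Observed pairwise choices (chosen, rejected): a strict cycle over {A, B, C}.
choices = [("B", "A"), ("C", "B"), ("A", "C")]

def rationalised_by(u, choices):
    """True iff utility function u rationalises every observed choice."""
    return all(u[chosen] > u[rejected] for chosen, rejected in choices)

# No assignment of utilities over the coarse outcomes rationalises the cycle.
coarse_ok = any(
    rationalised_by(dict(zip("ABC", perm)), choices)
    for perm in permutations([0, 1, 2])
)

# Fine-grain: tag each outcome with its choice context (here, the day).
fine_choices = [(("B", "mon"), ("A", "mon")),
                (("C", "tue"), ("B", "tue")),
                (("A", "wed"), ("C", "wed"))]

# Now a utility function exists trivially: chosen options get 1, rejected get 0.
u_fine = {}
for chosen, rejected in fine_choices:
    u_fine[chosen], u_fine[rejected] = 1, 0

print(coarse_ok, rationalised_by(u_fine, fine_choices))
```

The “utility function” recovered this way is just a relabelling of the observed behaviour, which is exactly why it buys no predictive content.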
>The less coherent and smart a system acts, the longer the utility function you need to specify...
These are two very different concepts? (Equating “coherent” with “smart” is again kinda begging the question.) Re: coherence, it’s just tautologous that the more finely you have to partition up outcome-space to make things look coherent, the more complex the resulting utility function will be. Re: smartness, if we’re operationalising this as “ability to steer the world towards states of higher utility”, then smartness and utility-function complexity are by definition independent. Unless you mean something more like “ability to steer the world in a way that is legible to us”, in which case it’s again just tautologous.