Lately, I tend to think of this as a distinction between the “proxy optimization” algorithm and the “optimality” of the actual plan. The algorithm: specify a proxy reward and a proxy set of plans, and search for the best one. You could call this “proxy optimization”.
The results: whatever actually happens, and how good it actually is. There’s not really a verb associated with this—you can’t just make something as good as it can possibly be (not even “in expectation”—you can only optimize proxies in expectation!). But it still seems like there’s a loose sense in which you can be aiming for optimality.
Off the top of my head, there are a few ways proxy optimization can hurt, and most of them seem to come down to “better optimizing a worse proxy”. You could deliberately alter the problem so that it is tractable for proxy optimization, or you could simply invest too much in proxy optimization relative to constructing a good proxy. This seems to roughly agree with your advice: investing lots in proxy optimization is particularly beneficial when the proxy is already pretty good, or when it will reveal very large differences between prospective plans (differences that are unlikely to be erased by considering a better proxy). I actually feel that some caution might be needed in the setting where there are apparently many orders of magnitude between the values of different plans (according to a proxy). Something like: if the system is apparently so sensitive to the things you are taking into account, then there’s reason to believe it might also be quite sensitive to the things you’re not taking into account.
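To make that last worry concrete, here is a toy sketch in Python. Everything in it is a made-up assumption for illustration: the `Plan` class, the three plans and their numbers, and especially the choice to make the true value a product of one factor the proxy sees and one it ignores. It is not meant as a model of any real decision, just a small instance of “better optimizing a worse proxy”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    modeled: float    # factor the proxy takes into account (hypothetical)
    unmodeled: float  # factor the proxy ignores (hypothetical)


def proxy_value(plan: Plan) -> float:
    # The proxy only sees the modeled factor.
    return plan.modeled


def true_value(plan: Plan) -> float:
    # Assumption: the real outcome is as sensitive to the ignored factor
    # as it is to the modeled one, so the two multiply.
    return plan.modeled * plan.unmodeled


def proxy_optimize(plans, proxy):
    # "Proxy optimization": search the proxy set of plans for the proxy-best one.
    return max(plans, key=proxy)


# Illustrative numbers only. The proxy spans three orders of magnitude,
# and so does the factor it leaves out.
PLANS = [
    Plan("A", modeled=1000.0, unmodeled=0.0001),
    Plan("B", modeled=10.0, unmodeled=1.0),
    Plan("C", modeled=1.0, unmodeled=5.0),
]

chosen = proxy_optimize(PLANS, proxy_value)
best = max(PLANS, key=true_value)

print("proxy picks:", chosen.name, "| true value:", true_value(chosen))
print("actually best:", best.name, "| true value:", true_value(best))
```

In this toy setup the proxy says A beats B by a factor of 100, yet A turns out to be the worst of the three plans once the ignored factor is included. The huge apparent gap only carries over to reality if the omitted factors vary much less than the modeled ones, which is exactly what is in doubt when the system looks that sensitive.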