Jobst Heitzig (vodle.it) comments on My lab’s small AI safety agenda

Jobst Heitzig (vodle.it) 20 Jun 2023 11:05 UTC
2 points
1 ∶ 0
That depends on what you mean by “continuously improving until you reach a limit which is not necessarily the global limit”.
I guess by “continuously” you probably do not mean “in continuous time” but rather “repeatedly in discrete time steps”? So you imagine a sequence r(s1) < r(s2) < … ? Well, that could converge to anything larger than each of the r(sn). E.g., if r(sn) = 1 − 1/n, it will converge to 1. (It will of course never “reach” 1 since it will always be below 1.) This is completely independent of what the local or global maxima of r are. They can obviously be way larger. For example, if the function is r(s) = s and the sequence is sn = 1 − 1/n, then r(sn) converges to 1 even though r is unbounded, so its supremum is infinity. So, as I said before, unless your sequence of improvements is part of an attempt to find a maximum (that is, part of an optimization process), there is no reason to expect that it will converge to some maximum.
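To make the example above concrete, here is a minimal Python sketch. The function r(s) = s and the sequence sn = 1 − 1/n are the ones from the example; the other names (`rewards`, `strictly_increasing`, `stays_below_one`, `gap_to_limit`) and the cutoff of 10,000 steps are just assumptions for illustration.

```python
def r(s: float) -> float:
    """Reward function from the example: r(s) = s, which is unbounded above."""
    return s


def s(n: int) -> float:
    """n-th state of the example sequence: s_n = 1 - 1/n."""
    return 1.0 - 1.0 / n


# Rewards along the first 10,000 steps (the cutoff is an arbitrary choice).
rewards = [r(s(n)) for n in range(1, 10_001)]

# Every step is a strict improvement over the previous one ...
strictly_increasing = all(a < b for a, b in zip(rewards, rewards[1:]))

# ... yet the rewards never reach 1, which is the limit of the sequence.
stays_below_one = all(x < 1.0 for x in rewards)

# Remaining distance to the limit after 10,000 steps (equals 1/10000).
gap_to_limit = 1.0 - rewards[-1]

print(strictly_increasing, stays_below_one, gap_to_limit)
print(r(1000.0))  # r itself can take values far above that limit
```

The sequence improves forever, but its limit has nothing to do with any maximum of r.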
Btw., this also shows that if you have two competing satisficers whose only goal is to outperform the other and who therefore repeatedly improve their reward to be larger than the other agent’s current reward, this does not imply that their rewards will converge to some maximum reward. They can easily be programmed to avoid this by just outperforming the other by an amount of 2**(-n) in the n-th step, so that their rewards converge to the initial reward plus one, rather than to whatever maximum reward might be possible.
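A rough Python sketch of the two-satisficer case, under some assumptions of mine: both agents start at the same initial reward, they take turns (one agent moves per step), and the helper name `compete` is hypothetical. The only part taken from the argument above is the rule "outperform the other by 2**(-n) in the n-th step".

```python
def compete(initial_reward: float, steps: int) -> tuple:
    """Simulate two satisficers that alternately outperform each other.

    In step n (n = 1, 2, ...), the agent whose turn it is sets its own
    reward to the other agent's current reward plus 2**(-n).
    """
    rewards = [initial_reward, initial_reward]
    for n in range(1, steps + 1):
        mover = (n - 1) % 2   # agents alternate: 0, 1, 0, 1, ...
        other = 1 - mover
        rewards[mover] = rewards[other] + 2.0 ** (-n)
    return rewards[0], rewards[1]


a, b = compete(initial_reward=0.0, steps=40)

# Both rewards approach initial_reward + 1 from below and never exceed it,
# no matter how large the maximum possible reward is.
print(a, b, max(a, b) < 1.0)
```

Since the increments 2**(-1) + 2**(-2) + … sum to 1, the leading reward after n steps is the initial reward plus 1 − 2**(-n).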
- titotal 20 Jun 2023 14:01 UTC
  2 points
  0 ∶ 0
  Ah, well explained, thank you. Yes, I agree now that you can theoretically improve to a limit without that limit being a local maximum. Although I’m unsure whether the procedure could end up being equivalent in practice to a local maximisation with a modified goal function (say one that penalises going above “reward + 1” with exponential cost). Maybe something to think about going forward.
  Thanks for answering the questions, best of luck with the endeavour!