titotal comments on My lab’s small AI safety agenda

titotal 20 Jun 2023 10:44 UTC
6 points
0 ∶ 1
With regards to your improvements definition, isn’t “continuously improving until you reach a limit with is not necessarily the global limit” just a different way of describing local optimization? It sounds like you’re just describing a hill climber.
I do agree with building a satisficer, as this describes more accurately what the user actually wants! I want a cheap flight, but I wouldn’t be willing to wait 3 days for the program to find the cheapest possible flight that saved me 5 bucks. But on the other hand, if I told it to find me flights under 500 bucks, and it served me up a flight for 499 bucks even though there was another equally good option for 400 bucks, I’d be pretty annoyed.
It seems like some amount of local optimisation is necessary for an AI to be useful.
- Jobst Heitzig (vodle.it) 20 Jun 2023 11:05 UTC
  3 points
  1 ∶ 0
  Parent
  That depends what you mean by “continuously improving until you reach a limit which is not necessarily the global limit”.
  I guess by “continuously” you probably do not mean “in continuous time” but rather “repeatedly in discrete time steps”? So you imagine a sequence r(s1) < r(s2) < … ? Well, that could converge to anything larger than each of the r(sn). E.g., if r(sn) = 1 − 1/n, it will converge to 1. (It will of course never “reach” 1 since it will always below 1.) This is completely independent of what the local or global maxima of r are. They can obviously be way larger. For example, if the function is r(s) = s and the sequence is sn = 1 − 1/n, then r(sn) converges to 1 but the maximum of r is infinity. So, as I said before, unless your sequence of improvements is part of an attempt to find a maximum (that is, part of an optimization process), there is no reason to expect that it will converge to some maximum.
  Btw., this also shows that if you have two competing satisficers whose only goal is to outperform the other and who therefore repeatedly improve their reward to be larger than the other agents’ current reward, this does not imply that their rewards will converge to some maximum reward. They can easily be programmed to avoid this by just outperforming the other by an amount of 2**(-n) in the n-th step, so that their rewards converge to the initial reward plus one, rather than to whatever maximum reward might be possible.
  - titotal 20 Jun 2023 14:01 UTC
    3 points
    0 ∶ 0
    Parent
    Ah, well explained, thank you. Yes, I agree now that you can theoretically improve to a limit without having that limit being a local maxima. Although I’m unsure if the procedure could end up being equivalent in practice to a local maximisation with a modified goal function (say one that penalises going above “reward + 1” with exponential cost). Maybe something to think about when going forward.
    Thanks for answering the questions, best of luck with the endeavour!