titotal comments on My lab’s small AI safety agenda

titotal 20 Jun 2023 14:01 UTC
3 points
Ah, well explained, thank you. Yes, I agree now that you can theoretically improve up to a limit without that limit being a local maximum. However, I’m unsure whether the procedure could end up being equivalent in practice to local maximisation of a modified goal function (say, one that penalises going above “reward + 1” with an exponential cost). Maybe something to think about going forward.
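To make the worry concrete, here is a rough sketch of the kind of modified goal function I have in mind. It is only an illustration, not your actual procedure: the names `modified_goal`, `baseline` and `sharpness`, and the specific penalty shape, are all my own assumptions.

```python
import math


def modified_goal(reward: float, baseline: float, sharpness: float = 2.0) -> float:
    """Hypothetical goal: plain reward up to baseline + 1, exponential cost above it."""
    cap = baseline + 1.0                  # the "reward + 1" threshold
    overshoot = max(0.0, reward - cap)    # how far the reward exceeds the threshold
    # expm1(0) == 0, so the penalty vanishes at or below the cap
    # and grows exponentially with the overshoot above it.
    return reward - math.expm1(sharpness * overshoot)


# A plain maximiser over candidate reward levels 0.0, 0.1, ..., 5.0.
candidates = [i / 10 for i in range(51)]
best = max(candidates, key=lambda r: modified_goal(r, baseline=2.0))
print(best)  # prints 3.0, i.e. baseline + 1
```

With `sharpness` above 1, the modified goal starts decreasing immediately past the cap, so an ordinary maximiser of this function stops at “reward + 1” and ends up looking much like a procedure that improves only up to a limit.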
Thanks for answering the questions, best of luck with the endeavour!