The AI safety community has gotten people to do reinforcement learning from human feedback (rather than automated reward functions) sooner than it would otherwise have happened.
There are many subtleties about whether this reduced x-risk, but I think it did.