I would rather not. This would pressure people into goodharting their projects for legibility, which is one of the things our setup is supposed to prevent.
(tl;dr: an agent is legible if a principal can easily monitor them, but legibility limits the agent’s options to what is easy for the principal to measure, which might reduce performance)
Quite a few of our guests are not even on this list, but this doesn’t mean they’re sitting around doing nothing all day. They’re doing illegible work that is hard or even impossible to evaluate at a distance. I put a few examples in the second caveat of the post.
(I realise this is at odds with the EA maxim of measuring outcomes. That’s why we published this post: so the hotel could at least be evaluated in aggregate. I think it’s neat that people with illegible work can hide behind those with legible work)