Hang on, the category/example you cite is listed in the ‘Recommended use of LLMs’ section, so I’m not sure what you’re disagreeing with.
Indeed, almost half the post is about distinguishing good from bad uses of LLMs, so I’m struggling to make sense of your last paragraph. Are you referring to discussion elsewhere that demonizes all AI use for writing?
Thanks for raising this. I agree the setup isn’t ideal, but the listed judges are the people who combine sufficiently good judgement and relevant expertise with, crucially, the willingness to put in the time. (I did try to enlist other experts, but it didn’t come together.)
To clarify, the essay panel isn’t just the author plus a collaborator—it’s three people: Anthony (who wrote the sequence), Jesse (who collaborated on related work), and Andreas Mogensen, an Oxford philosopher who has previously worked on cluelessness but who had no involvement in the sequence and is independent of it. I’m not sure this fully dissolves the biases/incentives concern, but I hope it goes some way toward doing so.