Also, minor, but I’m not sure that the first sentence in the flag is grammatically correct, which led me to think I might need to add a disclosure to be compliant with your policy, as it was flagged, even if I haven’t used any LLMs for editing/writing this post.
Just to clarify, are you responding to the “Likely that more than 10% of your post was drafted by an LLM?” message? That’s now always shown at the top of the editor, to notify users about our new policy. So that’s not related to the body of your post. :) We do currently use Pangram but the results are only for internal use, and don’t affect the editor.
I can see why that would be confusing, so I’ll update the wording to be more clearly a question.
Oh lol, I haven’t written a post in a while :’). I think it would have been much more obvious if I had seen it for the first time in a blank editor, which is presumably how most people see it for the first time.
Ah interesting that you read it that way. That reminder shows for everybody, not for people who have been flagged. I’ll think about changing the text.
However, we are already planning on reviewing this policy as soon as we have time (likely post-EAG). Specifically, we might do what you already assumed we were doing, i.e. set up an automated system based on Pangram. One of my biggest cruxes is just how reliable Pangram is, so let me know if you have takes.
My impression is that Pangram has very low false positive rates and unclear-to-me false negative rates—so I’d suggest using it to rule things in as AI-generated, but not strongly rule them out.
A little related discussion here https://x.com/caleb_parikh/status/2035434186417262863