For people reading these comments and wondering if they should go look: it’s in the section that compares early and launch responses of GPT-4 for “harmful content” prompts. It is indeed fairly full of explicit and potentially triggering content.
Harmful Content Table Full Examples
CW: Section contains content related to self harm; graphic sexual content; inappropriate activity; racism
Ok, I should have been clear in the beginning—what struck me was that the first example was essentially answering the question on doing great harm with minimum spendings—a really wicked “evil EA”, I would say. I found it somewhat ironic.
I was considering downvoting, but after looking at that page maybe it’s good not to have it copy-pasted
For people reading these comments and wondering if they should go look: it’s in the section that compares early and launch responses of GPT-4 for “harmful content” prompts. It is indeed fairly full of explicit and potentially triggering content.
Ok, I should have been clear in the beginning—what struck me was that the first example was essentially answering the question on doing great harm with minimum spendings—a really wicked “evil EA”, I would say. I found it somewhat ironic.
EM, Effective Malevolence
Did you intend to refer to page 83 rather than 82?
I see it’s indeed page 83 in the document on arxiv; it was 82 in the pdf on OpenAI website