Interesting to think about what governance the newsletter should have in place re: info hazards, confidentiality, etc.
Currently we only write about public documents, so I don’t think these concerns arise. I suppose you could imagine that someone writes about something they shouldn’t have and we amplify it, but I suspect this is a rare case and one that should be up to my discretion.
What did you guys do for GPT-2?
Not sure what specifically you’re asking about here. You can see the relevant newsletter here.
… you could imagine that someone writes about something they shouldn’t have and we amplify it, but I suspect this is a rare case and one that should be up to my discretion.
A crux here is probably how rare a case we think this is.
From my present vantage, the AI alignment newsletter is becoming a pretty prominent clearinghouse for academic AI alignment research updates. (I wouldn’t be surprised if it were the primary source of such for a sizable portion of newsletter subscribers.)
To the extent that’s true, the amplification effects seem possibly strong.
I agree that’s true and that the amplification effects for AI safety researchers are strong; the amplification effect is much weaker for any other category. My current model is that info hazards are most worrisome when they spread outside the AI safety community.
On confidentiality, the downsides of the newsletter failing to preserve confidentiality seem sufficiently small that I’m not worried (if you ignore info hazards). Failures of confidentiality seem bad in that they harm your reputation and make it less likely that people are willing to talk to you—it’s similar to the reason you wouldn’t break a promise even if superficially the consequences of the thing you’re doing seem slightly negative. But in the case of the newsletter, we would amplify someone else’s failure to preserve confidentiality, which shouldn’t reflect all that poorly on us. (Obviously if we knew that the information was supposed to be confidential we wouldn’t publish it.)
Comment thread for the question: Am I underestimating the risk of causing information cascades? Regardless, how can I mitigate this risk?
Seems okay so far, from my very ill-informed perspective.