I think your cons are good things to have noted, but here are reasons why two of them might matter less than one might think:
I think the very fact that "It's possible that doing deliberate 'red-teaming' would make one predisposed to spot trivial issues rather than serious ones, or falsely identify issues where there aren't any" could actually also make this useful for skill-building and testing fit; people will be forced to learn to avoid those failure modes, and "we" (the community, potential future hirers, etc.) can see how well they do so.
E.g., to do this red teaming well, they may have to learn to identify how central an error is to a paper/post's argument, to think about whether a slightly different argument could reach the same conclusion without needing the questionable premise, etc.
I have personally found that the line between "noticing errors in existing work" and "generating novel research" is pretty blurry.
A decent amount of the research I've done (especially some that is unfortunately nonpublic so far) has basically followed these steps:
1. "This paper/post/argument seems interesting and important"
2. "Oh wait, it actually requires a premise that they haven't noted and that seems questionable" / "It ignores some other pathway by which a bad thing can happen" / "Its concepts/definitions are fuzzy or conflate things in a way that may obscure something important"
3. [I write a post/doc that discusses that issue, provides some analysis in light of this additional premise being required or this other pathway being possible or whatever, and discusses what implications this has, e.g. whether some risk is actually more or less important than we thought, or what new intervention ideas this alternative risk pathway suggests might be useful]
Off the top of my head, some useful pieces of public work by other people that I feel could be roughly described as "red teaming that turned into novel research" include A Proposed Adjustment to the Astronomical Waste Argument and The long-term significance of reducing global catastrophic risks.
I'd guess that the same could also sometimes happen with this red teaming, especially if that was explicitly encouraged, if people were given guidance on how to lean into this more "novel research" element when they notice something potentially major during the red teaming, if people were given examples of how that has happened in the past, etc.