The fellowship will cover what we currently consider to be the most important sources of s-risk (TAI conflict, risks from malevolent actors).
Is there a specific reason CLR believes that to be the case? For instance, it's argued on this page that botched alignment attempts/partially aligned AIs (near miss) and unforeseen instrumental drives of an unaligned AI are the two likeliest AGI-related s-risks, with malevolent actors (deliberately suffering-aligned AI) currently a lesser concern. I suppose TAI conflict could fall under the second category, as a risk derived from instrumental goals.
Just want to signal-boost the subreddit for s-risk discussion.