The fellowship will cover what we currently consider to be the most important sources of s-risk (TAI conflict, risks from malevolent actors).
Any reason CLR believes that to be the case specifically? For instance, it’s argued on this page that botched alignment attempts/partially aligned AIs (near miss) and unforeseen instrumental drives of an unaligned AI are the two likeliest AGI-related s-risks, with malevolent actors (deliberately suffering-aligned AI) currently a lesser concern. I guess TAI conflict could fall under the second category, as a risk derived from instrumental goals.
Thanks for asking — you can read more about these two sources of s-risk in Section 3.2 of our new intro to s-risks article. (We also discuss “near miss” there, but our current best guess is that such scenarios are significantly less likely than other s-risks of comparable scale.)