Wanted to make a very small comment on a very small part of this post.
> An assistant professor in AI wants to have several PhDs funded. Hearing about the abundance of funding for AI safety research, he drafts a grant proposal arguing why the research topic his group would be working on anyway helps not only with AI capabilities, but also with AI alignment. In the process he convinces himself this is the case, and as a next step convinces some of his students.
Yes, this certainly might be an issue! This particular issue can be mitigated by having funders do extensive grant follow-ups to make sure that differential progress in safety, rather than capabilities, is actually achieved.
X-Risk Analysis by Dan Hendrycks and Mantas Mazeika provides a good roadmap for doing this. There are also some details in this post (edit, since my connection may not have been obvious: I work with Dan and I'm an author of the second post).