I certainly think that having an academic discipline devoted to AI safety is an option, but I think it’s a bad idea for other reasons; if safety is viewed as separate from ML in general, you end up in a situation similar to cybersecurity, where everyone builds dangerous shit, and then the cyber people recoil in horror and, if we're lucky, just barely patch the most obvious problems.
That said, yes, I’m completely fine with having informal networks of people working on a goal; those exist regardless of anyone's efforts. But a centralized effort at EA community building in general is a different thing, and as I argued here, I tentatively think this is bad, at least at the margin.
I agree with you that separating AI safety from ML is terrible, since the objective of AI safety, in the end, is not only to study safety but to actually implement it in ML systems, and that can only be done in close communication with the general ML community (and I really enjoyed your analogy with cybersecurity).
I don’t know what the current state of this communication actually is, nor who is working on improving it (although I know people are discussing it), but one thing I would at least like to see is alignment papers published in NeurIPS, ICML, JMLR, and so on. My two-cent guess is that this would be easier if AI safety were more dissociated from EA or even longtermism, although I could easily envision myself being wrong.
EDIT: One important point to clarify is that “more dissociated” does not mean “fully dissociated” here. It may well be that EA donors support AI safety research, that effective altruism as an idea makes people look into AI safety, and so on. My worry is AI safety being seen by a lot of people as “that weird idea coming from EA/rationalist folks”. No matter how fair this view actually is, the point is that AI safety should be popular and non-controversial if safety techniques are to be adopted en masse (which is the end goal).