Further, I think that there are a bunch of arguments for the value of safety work within labs (e.g. access to SOTA models; building institutional capacity and learning; cultural outreach) which seem to me to be significant and which you’re not engaging with.
Let’s dig into the arguments you mentioned then.
Access to SOTA models
Given that safety research is intractable when open-ended and increasingly automated systems are scaled at anything near current rates, I don’t really see the value proposition here.
I guess if researchers noticed a bunch of bad design practices and violations of the law while inspecting the SOTA models, they could leak that information to the public?
Building institutional capacity and learning
Inside a corporation competing against other corporations, where the more power-hungry individuals tend to find their way to the top, the institutional capacity-building and learning you will see will be directed towards extracting more profit and power.
I think this argument, considered within its proper institutional context, actually cuts against your current conclusion.
Cultural outreach
This reminds me of the cultural exchanges between US and Soviet scientists during the Cold War. Are you thinking of something like that?
That said, I notice that the current situation is different in the sense that AI Safety researchers are not one side racing, in tandem with the other side (the AGI labs), to scale the proliferation of dangerous machines.
To the extent, though, that AI Safety researchers can come to share collectively important insights with colleagues at AGI labs (such as why and how to stop scaling dangerous machine technology), this cuts against my conclusion.
Looking from the outside, I haven’t seen that yet. Early AGI safety thinkers (e.g. Yudkowsky, Tegmark) and later funders (e.g. Tallinn, Karnofsky) instead supported AGI labs in starting up, even if they did not mean to.
But I’m open (and hoping!) to change my mind. It would be great if safety researchers at AGI labs started connecting to collaborate effectively on restricting harmful scaling.
I’m going off the brief descriptions you gave. Does that cover the arguments as you meant them? What did I miss?