Iâm a software engineer on the CEA Online team, mostly working on the EA Forum. We are currently interested in working with impactful projects as contractors/âconsultants, please fill in this form if you think you might be a good fit for this.
You can contact me at will.howard@centreforeffectivealtruism.org
I think this table from the paper gives a good idea of the exact methodology:
Like others Iâm not convinced this is a meaningful âred line crossingâ, because non-AI computer viruses have been able to replicate themselves for a long time, and the AI had pre-written scripts it could run to replicate itself.
The reason (made up by me) non-AI computer viruses arenât a major threat to humanity is that:
They are fragile, they canât get around serious attempts to patch the system they are exploiting
They lack the ability to escalate their capabilities once they replicate themselves (a ransomware virus canât also take control of your car)
I donât think this paper shows these AI models making a significant advance on these two things. I.e. if you found this model self-replicating you could still shut it down easily, and this experiment doesnât in itself show the ability of the models to self-improve.