We’re working on providing clarity on:
“What AI safety research agendas could be massively sped up by AI agents? What properties do they have (e.g. easily checkable, engineering > conceptual …)?”
I’ll strongly consider putting out a post with a detailed breakdown and notes on when we think it’ll be possible. We’re starting to run experiments that will hopefully inform things as well.
Do you have a list of research questions that you think could easily be sped up with AI systems? I suspect that I'm more pessimistic than you are, due to concerns about scheming AI agents intentionally sabotaging research, but I also think that the affordances of AI agents might make some currently intractable agendas more tractable.