I didn’t say “blanket ban on ML,” I said “a blanket ban on very large models.”
I know? I never said you asked for a blanket ban on ML?
Because I have not seen, and don’t think anyone can make, a clear criterion for “too high risk of doom,”
My post discusses “pause once an agent passes 10 or more of the ARC Evals tasks”. I think this is too weak a criterion and I’d argue for a harder test, but I think this is already better than a blanket ban on very large models.
Anthropic just committed to a conditional pause.
ARC Evals is pushing responsible scaling, which is a conditional pause proposal.
But also, the blanket ban on very large models is implicitly saying that “more powerful than GPT-4” is the criterion of “too high risk of doom”, so I really don’t understand at all where you’re coming from.