Do you disagree or were we just understanding the claim differently?
I disagree, assuming “GPT-5” here means an increase over GPT-4 comparable to the increase of GPT-4 over GPT-3 (which I think is what you are getting at in the paper?), rather than whatever the thing that will actually be called GPT-5 turns out to be. And assuming it has an “o-series style” reasoning model built on top of it, plus whatever other scaffolding is needed to make it agentic (computer use etc.).
“a notably incompetent or poorly-prepared society learns lots of new unknown unknowns all at once”
I think that is, unfortunately, where we are heading!
“It [ensuring that we get helpful superintelligence earlier in time] increases takeover risk(!)”
Emphasis here on the “helpful”
I think the problem is the word “ensuring”, when there’s no way we can ensure it. The result is increased risk if people take this as a green light to go faster and bring forward the time when we take the (most likely fatal) gamble on ASI.
“We need at least 13 9s of safety for ASI, and the best current alignment techniques aren’t even getting 3 9s...”
Can you elaborate on this? How are we measuring the reliability of current alignment techniques here?
I’m going by published results for various techniques, which report things like an 80% reduction in harmful outputs, a 90% reduction in deception, a 99% reduction in jailbreaks, etc.
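To make the arithmetic behind the “9s” framing concrete (my own back-of-envelope sketch, not a figure from the paper; the baseline failure rate of 1.0 is an assumption for illustration): N nines of safety corresponds to a residual failure probability of roughly 10^-N, so even a 99% reduction only gets you to about 2 nines, far short of 13.

```python
import math

def nines(failure_prob: float) -> float:
    """Number of 'nines' of reliability implied by a residual failure probability.
    e.g. failure_prob = 0.001 -> 3.0 (i.e. 99.9% reliable)."""
    return -math.log10(failure_prob)

# Illustration only: assume a baseline failure rate of 1.0 (failure essentially
# guaranteed without mitigation). A 99% reduction leaves a residual rate of 0.01:
print(nines(0.01))    # 2.0 -> about 2 nines
# ...versus the level argued to be needed for ASI:
print(nines(1e-13))   # 13.0 -> 13 nines
```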
Is this good or bad, on your view? Seems more stabilising than a regime which favours AI malfunction “first strikes”?
Yeah. Although an international non-proliferation treaty would be far better. Perhaps MAIM could prompt this, though?
“but perhaps we should have emphasised more that pausing is an option.”
Yes!
“But most ‘if-then’ policies I am imagining are not squarely focused on avoiding AI takeover”
They should be! We need strict red lines in the evals program[1].
See replies in the other thread. Thanks again for engaging!
[1] That are short of things like “found in the wild, escaped from the lab”(!)