Fantastic summary, Nicholas, Andrew, and Robert. I’m looking forward to reading the paper.
A few quick thoughts on the summary:
It’s reassuring to hear that information hazards are unlikely for lower values of the decisiveness parameter. One relevant follow-up question: how might AI developers form an opinion about what value the decisiveness parameter takes? Is this something we can hope to influence?
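For concreteness, here is a toy sketch of how I’m interpreting “decisiveness” (assuming a Tullock-style contest success function where the exponent d controls how strongly a capability lead translates into a probability of winning; the setup and names are my guesses, not necessarily the paper’s):

```python
# Toy sketch of a contest success function with a decisiveness parameter d.
# Assumption: win probability p_i = c_i**d / sum_j(c_j**d). This is my
# reading of "decisiveness", not a formula taken from the paper.

def win_probability(capabilities, d):
    """Each competitor's win probability, given capabilities and decisiveness d."""
    weights = [c ** d for c in capabilities]
    total = sum(weights)
    return [w / total for w in weights]

# The same 20% capability lead is nearly irrelevant at low decisiveness
# but close to decisive at high decisiveness:
print(win_probability([1.2, 1.0], d=1))   # ~[0.55, 0.45]
print(win_probability([1.2, 1.0], d=10))  # ~[0.86, 0.14]
```

On this reading, at low decisiveness a leaked capability edge shifts win probabilities only modestly, which would fit the claim that information hazards are milder there.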
It’s less reassuring to hear that framing AI safety as a group effort might discourage safety investment due to moral hazard. I do find your proposal to share safety knowledge with the leader promising. We might also want policymakers to have some way to ensure that those who share such knowledge are well compensated: doing so would give companies a preemptive motive to invest in safety, since they could sell that knowledge to the leader if they fall behind in the race.
I really like that you caution against updating on the basis of a model alone. It encourages me to think about how we might empirically test these claims about moral hazard and decisiveness.
Thanks for your thoughts!
Good question! I’ll need to think more about this, but my initial impression is that regularly surveying developers about AI progress could help by quantifying their uncertainty over the arrival rates of particular milestones; that uncertainty is likely correlated with how they believe expected capabilities investment maps onto progress.
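To make the surveying idea concrete, a toy sketch (the milestone forecasts, numbers, and the dispersion-as-uncertainty proxy are all illustrative assumptions on my part):

```python
# Toy sketch: treat each developer's forecast of a milestone's arrival year
# as a sample, and use the spread across respondents as a rough proxy for
# uncertainty over the arrival rate. All figures are made up for illustration.
import statistics

forecasts = [2030, 2035, 2028, 2045, 2032]  # hypothetical survey responses

median = statistics.median(forecasts)
spread = statistics.stdev(forecasts)
print(f"median forecast: {median}, spread (uncertainty proxy): {spread:.1f} years")
```

Tracking how that spread moves across repeated survey waves could then hint at how developers’ beliefs about the investment-to-progress mapping are shifting.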
That seems right, though it likely depends upon how substitutable safety research is across firms.