There are other safety problems—often ones that are more speculative—that the market is not incentivizing companies to solve.
My personal response would be as follows:
As Leopold presents it, what keeps labs in check here is societal constraints on deployment, not their perceived ability to make money. The hope is that society’s response has the following properties:
thoughtful, prominent experts are attuned to these risks and demand rigorous responses
policymakers are attuned to (thoughtful) expert opinion
policy levers exist that provide policymakers with oversight / leverage over labs
If labs are sufficiently thoughtful, they’ll notice that deploying models is in fact bad for them! Can’t make profit if you’re dead. *taps forehead knowingly*
but in practice I agree that lots of people are motivated by the tastiness of progress, pro-progress vibes, etc., and will not notice the skulls.
Counterpoints to 1:
Good regulation of deployment is hard (though not impossible in my view).
reasonable policy responses are difficult to steer towards
attempts at raising awareness of AI risk could lead to policymakers getting too excited about the promise of AI while ignoring the risks
experts will differ; policymakers might not listen to the right experts
Good regulation of development is much harder, and will eventually be necessary.
This is the really tricky one IMO. I think it requires pretty far-reaching regulations that would be difficult to get passed today and would probably misfire a lot. But it doesn’t seem impossible, and I know people are working on laying the groundwork for this in various ways (e.g. pushing for labs to incorporate evals into their development process).
Like Akash, I agree with a lot of the object-level points here and disagree with some of the framing / vibes. I’m not sure I can pin down my framing concerns, but I do want to say I appreciate you articulating the following points:
Society is waking up to AI risks, and will likely push for a bunch of restrictions on AI progress
Sydney and the ARC CAPTCHA example have made AI safety concerns more salient.
There’s room for substantially more worry about AI risk to emerge after even mild warning events (e.g. AI-enabled cyberattacks, crazier behavior surfacing during evals)
Society’s response will be dumb and inefficient in a lot of ways, but could also end up getting pointed in some good directions
The more an org’s ability to develop and deploy AI is constrained by safety considerations (whether its own or other stakeholders’), the more safety looks like just another thing you need in order to deploy powerful AI systems, so that safety work becomes a complement to capabilities work.