We should expect the incentives and culture of AI-focused companies to make them uniquely terrible for producing safe AGI.
From a “safety from catastrophic risk” perspective, I suspect an “AI-focused company” (e.g. Anthropic, OpenAI, Mistral) is abstractly pretty close to the worst possible organizational structure for getting us towards AGI. I have two distinct but related reasons:
Incentives
Culture
From an incentives perspective, consider realistic alternative organizational structures to “AI-focused company” that nonetheless have enough firepower to host successful multibillion-dollar scientific/engineering projects:
As part of an intergovernmental effort (e.g. CERN’s Large Hadron Collider, the ISS)
As part of a governmental effort of a single country (e.g. Apollo Program, Manhattan Project, China’s Tiangong)
As part of a larger company (e.g. Google DeepMind, Meta AI)
In each of those cases, I claim that there are stronger (though still not ideal) organizational incentives to slow down, pause/stop, or roll back deployment if there is sufficient evidence or reason to believe that further development can result in major catastrophe. In contrast, an AI-focused company has every incentive to go ahead on AI when the case for pausing is uncertain, and minimal incentive to stop or even take things slowly.
From a culture perspective, I claim that without knowing any details of the specific companies, you should expect AI-focused companies to be more likely than the plausible alternatives to have the following cultural elements:
Ideological AGI Vision: AI-focused companies may have a large contingent of “true believers” who are ideologically motivated to make AGI at all costs.
No Pre-existing Safety Culture: AI-focused companies may have minimal or no strong “safety” culture where people deeply understand, have experience in, and are motivated by a desire to avoid catastrophic outcomes.
The first one should be self-explanatory. The second one is a bit more complicated, but basically I think it’s hard to have a safety-focused culture just by “wanting it” hard enough in the abstract, or by talking a big game. Instead, institutions tend (relatively speaking) to have a safer and more robust culture if they have previously suffered the (large) costs of not focusing enough on safety.
For example, engineers who aren’t software engineers understand fairly deep down that their mistakes can kill people, and that their predecessors’ fuck-ups have indeed killed people (think bridges collapsing, airplanes crashing, medicines not working, etc). Software engineers rarely have such experience.
Similarly, governmental institutions have institutional memory of major historical fuckups, in a way that new startups very much don’t.
I think there’s a decently-strong argument for there being some cultural benefits from AI-focused companies (or at least AGI-focused ones) – namely, because they are taking the idea of AGI seriously, they’re more likely to understand and take seriously AGI-specific concerns like deceptive misalignment or the sharp left turn. Empirically, I claim this is true – Anthropic and OpenAI, for instance, seem to take these sorts of concerns much more seriously than do, say, Meta AI or (pre-Google DeepMind) Google Brain.
Speculating, perhaps the ideal setup would be if an established organization swallows an AGI-focused effort, like with Google DeepMind (or like if an AGI-focused company was nationalized and put under a government agency that has a strong safety culture).
This is interesting. In my experience, both with starting new businesses within larger organizations and with working in startups, one of the main advantages of startups is exactly that they can take a much more relaxed approach to safety and take on much more risk. This is the very reason for the adage “move fast and break things”. In software it is less pronounced but still important: a new fintech product developed within, e.g., Oracle will face a great deal of scrutiny, partly for reputational reasons, but also because if it were rolled out embedded in Oracle’s other systems it might cause large-scale damage to clients. Or, imagine if Bird (the electric scooter company) had been an initiative from within Volvo; they absolutely would not have been allowed to be as reckless with their riders’ safety.
I think you might find examples of this by comparing, e.g., OpenAI’s approach to AI safety with Volvo’s approach to autonomous driving.
Not disagreeing with your thesis necessarily, but I disagree that a startup can’t have a safety-focused culture. Most mainstream (i.e., not crypto) financial trading firms started out as very risk-conscious startups. This can be hard to evaluate from the outside, though, and definitely depends on committed executives.
Regarding the actual companies we have, though, my sense is that OpenAI is not careful and I’m not feeling great about Anthropic either.
I agree that it’s possible for startups to have a safety-focused culture! The question that’s interesting to me is whether it’s likely / what the prior should be.
Finance is a good example of a situation where you can often get a safety culture despite no prior experience with your products (or your predecessors’ products, etc.) killing people. I’m not sure why that happened? Some combination of 2008 making people aware of systemic risks + regulations successfully creating a stronger safety culture?
Oh sure, I’ll readily agree that most startups don’t have a safety culture. The part I was disagreeing with was this:
I think it’s hard to have a safety-focused culture just by “wanting it” hard enough in the abstract
Regarding finance, I don’t think this is about 2008, because there are plenty of trading firms that were careful from the outset and were founded well before the financial crisis. I do think there is a strong selection effect happening, where we don’t really observe the firms that weren’t careful (because they blew up eventually, even if they were lucky in the beginning).
How do careful startups happen? Basically I think it just takes safety-minded founders. That’s why the quote above didn’t seem quite right to me. Why are most startups not safety-minded? Because most founders are not safety-minded, which in turn is probably due in part to a combination of incentives and selection effects.
How do careful startups happen? Basically I think it just takes safety-minded founders.
Thanks! I think this is the crux here. I suspect what you say isn’t enough but it sounds like you have a lot more experience than I do, so happy to (tentatively) defer.
I’m interested in what people think are the strongest arguments against this view. Here are a few counterarguments that I’m aware of:
1. Empirically, the AI-focused scaling labs seem to care quite a lot about safety and make credible commitments to safety. If anything, they seem to be “ahead of the curve” compared to larger tech companies or governments.
2. Government/intergovernmental agencies, and to a lesser degree larger companies, are bureaucratic, sclerotic, and generally less competent.
3. The AGI safety issues that EAs worry about the most are abstract and speculative, so having a “normal” safety culture isn’t as helpful as buying into the more abstract arguments, which you might expect to be easier for newer companies to do.
4. Scaling labs share “my” values. So AI doom aside, all else equal, you might still want scaling labs to “win” over democratically elected governments/populist control.
Perhaps governments are no longer able to raise enough funds for such projects (?)
On the competency topic: I was convinced by Mariana Mazzucato, in her book Mission Economy, that the public sector is suited for such large-scale projects if strong enough motivation is found. She also discusses the financial vs. “public good” motivations of the private and public sectors in detail.