The Altman you need to distrust & assume bad faith of & need to be paranoid about stealing your power is also usually an Altman who never gave you any power in the first place! I’m still kinda baffled by it, personally.
Two explanations come to my mind:
Past Sam Altman didn’t trust his future self, and wanted to use the OpenAI governance structure to constrain himself.
His status game / reward gradient changed (at least subjectively from his perspective). At the time it was higher status to give EA more power / appear more safety-conscious, and now it’s higher status to take it back / race faster for AGI. (I note there was internal OpenAI discussion about wanting to disassociate with EA after the FTX debacle.)
Both of these reasons probably played some causal role in what happened, but may well have been subconscious considerations. (Also entirely possible that he changed his mind in part for what we’d consider fair reasons.)
So, what could the EA faction of the board have done? …Not much, really. They only ever had the power that Altman gave them in the first place.
Some ideas for what they could have done:
Reasoned about why Altman gave them power in the first place. Maybe came up with hypotheses 1 and 2 above (or others) earlier in the course of events, tested them when possible, and used them to inform decision-making.
If they thought 1 was likely, they could have talked to Sam about it explicitly at an early date, asked for more power or failsafes, gotten more/better experts (at corporate politics) to advise them, monitored Sam more closely, and developed preparations/plans for the possible future fight. They could also have asked Sam to publicly talk about how he didn’t trust himself, so that the public would be more sympathetic to the board when the time came.
If 2 seemed likely, they could have tried to manage Altman’s status (or reward in general) gradient better. For example, given prominent speeches / op-eds highlighting AI x-risk and OpenAI’s commitment to safety, asked/forced Sam to frequently do the same, and managed risk better so that FTX didn’t happen.
Not backed Sam in the first place, so they could criticize/constrain him from the outside (e.g., by painting him/OpenAI as insufficiently safety-focused and pushing harder for government regulations). Or made it an explicit and public condition of backing him that EA (including the board members) was allowed to criticize and try to constrain OpenAI, and frequently reminded the public of this condition, in part by actually doing so.
Made it OpenAI policy that past and present employees are allowed/encouraged to publicly criticize OpenAI, so that for example the public would be aware of why the previous employee exodus (to Anthropic) happened.
Past Sam Altman didn’t trust his future self, and wanted to use the OpenAI governance structure to constrain himself.
His status game / reward gradient changed (at least subjectively from his perspective). At the time it was higher status to give EA more power / appear more safety-conscious, and now it’s higher status to take it back / race faster for AGI. (I note there was internal OpenAI discussion about wanting to disassociate with EA after the FTX debacle.)
I think it’s reasonable to think that “Constraining Sam in the future” was obviously a highly Pareto-efficient deal. EA had every reason to want Sam constrained in the future. Sam had every reason to make that trade, gaining needed power in the short term in exchange for more accountability and oversight in the future.

This is clearly a sensible trade that actual good guys would make; not “Sam didn’t trust his future self” but rather “Sam had every reason to agree to sell off his future autonomy in exchange for cooperation and trust in the near term”.
I think “the world changing around Sam and EA, rather than Sam or EA changing” is worth more nuance. I think that, over the last 5 years, the world changed to make groups of humans vastly more vulnerable than before, due to new AI capabilities facilitating general-purpose human manipulation and the world’s power players investing in those capabilities.

This dramatically increased the risk of outsider third parties creating or exploiting divisions in the AI safety community, to turn people against each other and use the chaos as a ladder. Given that this risk was escalating, centralizing power was clearly the correct move in response.

I’ve been warning about this during the months before the OpenAI conflict started and in the preceding weeks (including the concept of an annual discount rate for each person, based on the risk of that person becoming cognitively compromised and weaponized against the AI safety community), and I even described the risk of one of the big tech companies hijacking Anthropic 5 days before Sam Altman was dismissed. I think it’s possible that Sam or people in EA also noticed the world rapidly becoming less safe for AI safety orgs, discovering the threat from a different angle than I did.
Thanks, I didn’t know some of this history.