“I think EA’s failure to grapple with the corrupting influence of power is among its greatest failures.”
This has been the feature of forum discussions that has disturbed me possibly the most since joining. People don’t like to put any weight on conflicts of interest even when the person arguing a point has a huge amount to gain. “Just argue the object point” people say, don’t bring up the conflict of interest…
People seem surprised and bewildered when AI folks defect away from AI safety towards capabilities. People trust that as AI companies grow, those gaining power and money from shares will not be adversely influenced by that power and money.
Even as I have gained a teeny weeny bit of power just in a teeny weeny corner of the global health world, I have felt a little of the corrupting influence. Living far away from this in Uganda, I’m not part of it at all and, like you, I find it very unclear what can be done to help, but talking about it a bit could be better than nothing. I loved this post, thank you!
fwiw I don’t actually know many examples of this, and the ones I hear cited often seem uncompelling to me. E.g.:
Greg Brockman doesn’t seem like a true believer in OpenAI’s nonprofit mission who got corrupted but rather someone who went into it wanting to make a profit
Mechanize’s founders don’t seem like EAs who got corrupted by AI money but rather EAs with unusual moral and empirical views which result in them thinking that the best course of action is the exact opposite of what most EAs think
(Counterexamples appreciated, though!)
Hmm, I think if smart EA/Rat types get “corrupted” in general, they’ll present as thoughtful people with reasons that are hard to dismiss quickly when questioned by EAs. I get the vague sense that your evidence bar for “corruption” is going to be too high to be useful in most worlds where there’s a lot of corruption.
(that’s not to say that EAs/Rats/etc. who join labs/start wildly profitable companies speeding up AI progress have been “corrupted”—I just think if they were, it would present pretty similarly to how it has done and it’s hard to get lots of easy to share evidence)
I don’t disagree, it’s more that this feels a bit like privileging the hypothesis? I think the modal reason I’ve heard from people who did capabilities work and now regret it is something like “I knew I was misaligned with leadership but I thought leaving would be even worse.”
If, for some reason, Anthropic asked me how to prevent people from regretting working for them, I would focus much more on “have a thing for people to do once they realize their colleague is corrupt” instead of “have a more nuanced way of telling if their colleague is corrupt.”
I think he would include a lot of people who work at Anthropic, for example, on pre-training, some of whom went through MATS or something.
Thanks! I only know a handful of people in this category, but for what it’s worth, it again feels like people who were predisposed to thinking that working on pretraining would be okay rather than them being “corrupted.”
E.g., I recently talked to someone who told me that their main takeaway from a safety fellowship was realizing that they didn’t fit in because they actually weren’t worried about existential risk in the same way that the other attendees were.
In the AI frame I remember reading about 3 situations on the forum (one of which was Mechanize). I also consider this to a lesser extent around animal sentience arguments from those deep in the animal welfare world.
The most pertinent example for me would be Anthropic’s top leadership ditching their solid safety plan with clear red lines for a vague and practically useless one, and the justifications by @Holden Karnofsky (whose wife owns the company), which felt strange to me. He usually makes such compelling arguments, and that one seemed less so. I’m not the most rational person, but Habryka’s arguments against the safety plan change on LessWrong were compelling to me.
I’m not saying we shouldn’t argue the object point, but just that we should consider people’s incentives and weight the opinions of those with power/money conflicts of interest somewhat less heavily than those without.
+1, “it would be very easy for me to ignore the possibility that nematodes might be conscious” is a major impediment to thinking clearly about animal sentience (including for me).
Dude it’s basically the whole of Anthropic! And the fact that EAs (mostly) can’t see this is worrying. OP worries about Anthropic’s money being a corrupting influence, but their whole company is far far worse than FTX, because of the existential risk it’s subjecting the entire world to.
I suppose it’s a reaction to the tendency on the political left to not listen to a person at all because of some association they have with some group.
But I agree with you. We should be wary of these dynamics, without falling into black and white ways of thinking.
Also to second Nick, I really ‘felt’ and resonated with this post!
Arguing the object point is useful, and I love to see it done when possible.
Sometimes it is also useful to call out who is making the argument.
I see the argument that AI folks go from safety to capabilities made constantly (i.e., in every discussion of OpenAI’s origin). It seems correct but neither novel nor controversial in EA/rat spaces. E.g., Habryka’s last point on: https://www.lesswrong.com/posts/MqgwHJ93pJpaeHXs6/posts-i-don-t-have-time-to-write
Maybe we are reading different folks though. Do you have specific examples of you making conflict-of-interest arguments and folks on the forum pushing back on you to instead argue the object-level point?