EAs should avoid scrutinizing other community members’ personal career choices unless those individuals ask them for feedback.
I don’t think this is a good norm. Our career choices matter a lot, and this community thrives on having norms and a culture of trying to make the world better. It seems clear to me that someone’s standing in the EA community should be affected by their career choice; that is, after all, where the vast majority of their impact on the world will likely come from.
I also think it’s important not to be a dick about it, and not to pressure people with some kind of implied social consensus. But I think it’s good for people to approach other EAs whose careers seem harmful or misguided to them, and to have conversations in which they express their concerns and are honest about whether they think the person is causing harm overall.
If you don’t know a person well, you don’t have much visibility into what factors they’re considering and how they’re weighing those factors as they choose a job. Therefore, it is not your place to judge them.
I don’t understand the reasoning here. I agree that there can occasionally be good reasons to work in AI capabilities, and that it’s hard to tell in advance, with very high confidence, whether any given individual lacks such a reason. But the net effect of people working on AI capabilities seems clearly extremely negative to me, and if I hear that someone works on AI capabilities, I think they are probably causing great harm, and it makes sense to act on that despite the remaining uncertainty.
There is currently no career that seems to me to be as harmful as working directly on cutting-edge AGI capabilities. I will respect you less if you work in this field, and I honestly don’t want you to be a member in good standing of the EA community if you do this without a very good reason or without contributing majorly in some other way. I might still be interested in having you occasionally contribute intellectually or provide advice in various ways, and I am of course very open to trade of various forms, but I am not interested in you benefitting from the infrastructure, trust, and institutions that usually come with membership in the community.
Indeed I am worried that the EA community overall will be net-negative due to causing many people to work on AI capabilities with flimsy justifications just to stay close to the plot of what will happen with AGI, or just for self-enrichment (or things like earning-to-give, which I would consider even more reprehensible than SBF stealing money from FTX customers and then donating that to EA charities).
Thank you for explaining your position. Like you, I am concerned that organizations like OpenAI and the capabilities race they’ve created have robbed us of the precious time we need to figure out how to make AGI safe. However, I think we’re talking past each other to an extent: importantly, I said that we mostly shouldn’t criticize people for the organizations they work at, not for the roles they play in those organizations.
Most ML engineers have plenty of options for where to work, so choosing to work on AI capabilities research when so many alternatives exist outside of AI labs seems morally wrong. (On the other hand, given that AI capabilities teams exist, I’d rather they be staffed by engineers who are concerned about AI safety than engineers who aren’t.) However, I think there are many roles that plausibly advance AI safety that you could only do at an AI lab, such as promoting self-regulation in the AI industry. I’ve also heard arguments that advancing AI safety work sometimes requires advancing AI capabilities first. I think this was more true earlier: GPT-2 taught the AI safety community that it needs to focus on aligning large language models. But I am really doubtful that it’s true now.
In general, if someone is doing AI safety technical or governance work at an AI lab that is also doing capabilities research, it is fair game to tell them that you think their approach will be ineffective or that they should consider switching to a role at another organization to avoid causing accidental harm. It is not acceptable to tell them that their choice of where to work means they are “AI capabilities people” who aren’t serious about AI safety. Given that they are working on AI safety, it is likely that they have already weighed the obvious objections to their career choices.
There is also risk of miscommunication: in another interaction I had at another EA-adjacent party, I got lambasted after I told someone that I “work on AI”. I quickly clarified that I don’t work on cutting-edge stuff, but I feel that I shouldn’t have had to do this, especially at a casual event.
Rereading your post, it does make sense now that you were thinking of safety teams at the big labs, but both the title about “selling out” and point #3 about “capabilities people” versus “safety people” made me think you had working on capabilities in mind.
If you think it’s “fair game to tell them that you think their approach will be ineffective or that they should consider switching to a role at another organization to avoid causing accidental harm,” then I’m confused about the framing of the post as being “please don’t criticize EAs who ‘sell out’,” since this seems like “criticizing” to me. It also seems important to sometimes do this even when unsolicited, contra point 2. If the point is to avoid alienating people by making them feel attacked, then I agree, but the norms proposed here go a lot further than that.
Rereading your post, it does make sense now that you were thinking of safety teams at the big labs, but both the title about “selling out” and point #3 about “capabilities people” versus “safety people” made me think you had working on capabilities in mind.
Yes! I realize that “capabilities people” was not a good choice of words. It’s a shorthand based on phrases I’ve heard people use at events.
In general, if someone is doing AI safety technical or governance work at an AI lab that is also doing capabilities research, it is fair game to tell them that you think their approach will be ineffective or that they should consider switching to a role at another organization to avoid causing accidental harm. It is not acceptable to tell them that their choice of where to work means they are “AI capabilities people” who aren’t serious about AI safety. Given that they are working on AI safety, it is likely that they have already weighed the obvious objections to their career choices.
I think this perspective makes more sense than my original understanding of the OP, but I do think it is still misguided. Sadly, it is not very difficult for an organization to label a job “AI Safety” and then have the people in it work on things whose primary aim is to make the organization more money, in this case by working on things like AI bias or setting up RLHF pipelines, which might help a bit with some safety, but where the primary result is still billions of additional dollars flowing into AI labs that are primarily doing scaling-related work.
I sadly do not think that just because someone is working on “AI Safety,” they have weighed and properly considered the obvious objections to their career choices. Indeed, safety-washing seems easy and common, and if you can hire top EAs just by slapping a safety label on a capabilities position, then we will likely make the world worse.
I do react differently to someone working in a safety position, but I do have a separate, additional negative judgement if I find out that someone is actually working on capabilities while calling their work safety. I think that kind of deception is happening more and more, and it additionally makes coordinating and working in this space harder.
I have a very uninformed view on the relative alignment and capabilities contributions of things like RLHF. My intuition is that RLHF is positive for alignment, but I’m almost entirely uninformed on that. If anyone has written a summary of where they think these grey-area research areas lie, I’d be interested to read it. Scott’s recent post was not a bad entry into the genre, but it obviously just worked at a very high level.
earning-to-give, which I would consider even more reprehensible than SBF stealing money from FTX customers and then donating that to EA charities
AI capabilities EtG being morally worse than defrauding-to-give sounds like a strong claim.
There exist worlds where AI capabilities work is net positive. I appreciate that you may believe that we’re unlikely to be in one of those worlds (and I’m sure lots of people on this forum agree).
However, given this uncertainty, it seems surprising to see language as strong as “reprehensible” being used.
AI capabilities EtG being morally worse than defrauding-to-give sounds like a strong claim.
I mean, I do think causing all of humanity to go extinct is vastly worse than causing large-scale fraud. I of course think both are deeply reprehensible, but I also think that causing humanity’s extinction is vastly worse and justifies a much stronger response.
Of course, working on capabilities is a much smaller probabilistic increase in humanity’s extinction than SBF’s relatively direct fraudulent activities, and I do think this means the average AI capabilities researcher is causing less harm than Sam. But founding an organization like OpenAI seems to me to have substantially worse consequences than Sam’s actions. (Of course, for fraud we often have clearer lines we can draw, and norm enforcement should take into account uncertainty and ambiguity as well as a whole host of other considerations, so I don’t actually think most people should react to someone working at a capabilities lab to make money the same way they would react to hearing that someone had participated in fraud, though I think both deserve a quite strong response.)
There exist worlds where AI capabilities work is net positive.
I know very few people who have thought a lot about AI x-risk who think that marginally speeding up capabilities work is good.
There has been a bunch of disagreement on this topic over the last few years, but I think we are now close enough to AGI that almost everyone I know in the space would wish for more time and for things to marginally slow down. People who still think that marginally speeding up is good do exist, and there are arguments remaining, but there are of course also arguments for participating in many other atrocities, and the mere existence of someone of sane mind supporting an endeavor should not protect it from criticism, nor serve as a strong excuse to do it anyway.
There were extremely smart and reasonable people supporting the rise of the Soviet Union and the communist experiment, and I of course think those people should be judged extremely negatively in hindsight, given the damage that caused.
I’ve never seriously entertained the idea that EA is like a sect—until now. This is really uncanny.
Overall though, I agree with the point that it’s possible to raise questions about someone’s personal career choices without being unpleasant about it, and that doing this in a sensitive way is likely to be net positive.