Thanks for the reply!
In terms of public interviews, I think the most interesting/relevant parts are the ones where he expresses a willingness to bite consequentialist/utilitarian bullets in a way that’s a bit on the edge of the mainstream Overton window, but that I believe would’ve been within the EA Overton window prior to recent events (unsure about now). BTW, I got these examples from Marginal Revolution comments/Twitter.
This one seems most relevant—the first question Patrick asks Sam is whether the ends justify the means.
In this interview, search for “So why then should we ever spend a whole lot of money on life extension since we can just replace people pretty cheaply?” and “Should a Benthamite be risk-neutral with regard to social welfare?”
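To make concrete which bullet is being bitten there (illustrative numbers of my own, not a quote from the transcript): a risk-neutral Benthamite ranks gambles purely by expected welfare, so offered a double-or-nothing gamble on total welfare W with a 51% chance of winning, they compute

0.51 × 2W + 0.49 × 0 = 1.02W > W

and take the bet. The same calculation says to take it again on every subsequent round, even though the chance of having lost everything after n rounds is 1 − 0.51^n, which heads to 1. Being willing to accept near-certain ruin for a sliver of expected value is the edge-of-the-Overton-window part.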
In any case, given that you think people should put hardly any weight on your assessment, it seems to me that as a community we should be doing a fair amount of introspection. Here are some things I’ve been thinking about:
We should update away from “EA exceptionalism” and towards self-doubt. (EDIT: I like this thread about “EA exceptionalism”, though I don’t agree with all the claims.) It sounds like you think more self-doubt would’ve been really helpful for Sam. IMO, self-doubt should increase in proportion to one’s power. (Trying to “more than cancel out” the normal human tendency towards decreased self-doubt as power increases.) This one is tricky, because it seems bad to tell people who already experience Chidi Anagonye-style crippling self-doubt that they should self-doubt even more. But it certainly seems good for our average level of self-doubt to increase, even if self-doubt need not increase in every individual EA. Related: Having the self-awareness to know where you are on the self-doubt spectrum seems like an important and unsolved problem.
I’m also wondering if I should think of “morality” as being two different things: a descriptive account of what I value, and (separately) a prescriptive code of behavior. And then, beyond just endorsing the abstract concept of ethical injunctions, maybe it would be good to take a stab at codifying exactly what they should be. The idea seems a bit under-operationalized, although it’s likely there are relevant blog posts that aren’t coming to my mind. Like, I notice that the EA who’s most associated with the phrase “ethical injunctions” is also the biggest advocate of drastic unilateral action, and I’m not sure how to reconcile that (not trying to throw shade—genuinely unsure). EDIT: This is a great tweet; related.
Institutional safeguards are also looking better, but I was already very in favor of those and puzzled by the lack of EA interest, so I can’t say it was a huge update for me personally.
EA self-doubt has always seemed weirdly compartmentalized to me. Even the humblest people in the movement are often happy to dismiss the considered viewpoints of highly intelligent people on the grounds that they don’t satisfy EA principles. This includes me—I think we are sometimes right to do so, but we probably do so far too much nonetheless.
Seems plausible; I think it would be good to have a dedicated “translator” who tries to understand & steelman views that are less mainstream in EA.
Wasn’t sure about the relevance of that link?
(from phone) That was an example of an EA being highly upvoted for dismissing the life’s work of multiple extremely smart and well-meaning people as ‘really flimsy and incredibly speculative’ because he wasn’t satisfied that they could justify their work within a framework that the EA movement had decided is one of the only ones worth contemplating. As if that framework itself isn’t incredibly speculative (and therefore, if you reject any of its many suppositions, really flimsy).
Thanks!
I’m not sure I share your view of that post. Some quotes from it:
...he just believed it was really important for humanity to make space settlements in order for it to survive long-term… From what I could tell, [my professor] probably spend less than 10 hours seriously figuring out if space settlements would actually be more valuable to humanity than other alternatives.
...
Take SpaceX, Blue Origin, Neurolink, OpenAI. Each of these started with a really flimsy and incredibly speculative moral case. Now, each is probably worth at least $10 Billion, some much more. They all have very large groups of brilliant engineers and scientists. They all don’t seem to have researchers really analyzing the missions to make sure they actually make sense.
...
My impression is that Andrew Carnegie spent very little, if anything, to figure out if libraries were really the best use of his money, before going ahead and funding 3,000 libraries.
...
I rarely see political groups seriously red-teaming their own policies, before they sign them into law, after which the impacts can last for hundreds of years.
I don’t think any of these observations hinge on the EA framework strongly? Like, do we have reason to believe Andrew Carnegie spent a significant amount trying to figure out if libraries were a great donation target by his own lights, as opposed to according to the EA framework?
The thing that annoyed me about that post was that at the time it was written, it seemed to me that the EA movement was also fairly guilty of this! (It was written before the criticism/red teaming contest.)
I’m not familiar enough with the case of Andrew Carnegie to comment, and I agree on the point of political tribalism. The other two are what bother me.
On the professor, the problem is there explicitly: you omitted a key line, ‘I tried asking for his opinion on existential threats’, which is a strongly EA-identifying approach, and one which many people feel is too simplistic. E.g. see Gideon Futurman’s EAGx Rotterdam talk when it’s up—he argues that the way EAs think about x-risk is far too simplified, focusing on single-event narratives and ignoring countless possible trajectories that could end in extinction or similar, any one of which is vanishingly unlikely but which collectively we should take much more seriously. Whether or not one agrees with this view, it seems to me to be one a smart person could reasonably hold, and it shows that by asking someone for ‘his opinion on existential threats, and which specific scenarios these space settlements would help with’, you’re pigeonholing them into an EA-aligned, specific-single-event way of thinking.
As for Elon Musk, I think the same problem is there implicitly: he’s written a paper called ‘Making Humans a Multi-Planetary Species’, spoken extensively on the subject, and spent his life thinking that it’s important, and while you could reasonably disagree with his arguments, I don’t see any grounds for dismissing them as ‘really flimsy and incredibly speculative’ without engagement, unless your reason for doing so is ‘there exists a pool of important research which contradicts them and which I think is correct’. There are certainly plenty of other smart people who think as he does, some of them EAs (though maybe that doesn’t contribute to my original complaint). Since there’s a very clear mathematical argument that it’s harder to kill all of a more widespread and numerous civilisation, to say that the case is ‘really flimsy’, you basically need to assume the EA-aligned narrative that AI is highly likely to kill us all.
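To spell out the ‘very clear mathematical argument’ I have in mind (a toy sketch that leans on a strong independence assumption): if a given catastrophe destroys any single settlement with probability q, and n settlements are hit independently, then the probability it destroys all of them is q^n. Even with q = 0.5, ten independent settlements leave roughly a 1-in-1,000 chance of losing everything (0.5^10 ≈ 0.001). The catch, and the main route to ‘really flimsy’, is that independence fails for exactly the risks EAs emphasise: a misaligned AI, unlike an asteroid strike, isn’t confined to one planet, so spreading out buys far less protection than q^n suggests.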
Thanks!
What’s interesting about this interview clip though is that he seems to explicitly endorse a set of principles that directly contradict the actions he took!
Well that’s the thing—it seems likely he didn’t see his actions as contradicting those principles. Suggesting that they’re actually a dangerous set of principles to endorse, even if they sound reasonable. That’s what’s really got me thinking.
I wonder if part of the problem is a consistent failure of imagination on the part of humans to see how our designs might fail. Kind of like how an amateur chess player devotes a lot more thought to how they could win than how their opponent could win. So if the principles Sam endorsed are at all recoverable, maybe they could be recovered via a process like “before violating common-sense ethics for the sake of utility, go down a massive checklist searching for reasons why this could be a mistake, including external observers in the decision if possible”.
My guess is standard motivated reasoning explains why he thought he wasn’t in violation of his stated principles.
Question, though: why do you think the principles were dangerous, exactly? I’m confused about the danger you’re describing.
I think your first paragraph provides a potential answer to your second :-)
There’s an implicit “Sam fell prey to motivated reasoning, but I wouldn’t do that” in your comment, which itself seems like motivated reasoning :-)
(At least, it seems like motivated reasoning in the absence of a strong story for Sam being different from the rest of us. That’s why I’m so interested in what people like nbouscal have to say.)
So you think there’s too much danger of cutting yourself and everyone else via motivated reasoning, à la Dan Luu’s “Normalization of Deviance”, and that the principles have little room for errors in implementing them, is that right?
Here’s a link to it:
https://danluu.com/wat/
And a quote:
most human beings perceive themselves as good and decent people, such that they can understand many of their rule violations as entirely rational and ethically acceptable responses to problematic situations. They understand themselves to be doing nothing wrong, and will be outraged and often fiercely defend themselves when confronted with evidence to the contrary.
I’m not sure what you mean by “the principles have little room for errors in implementing them”.
That quote seems scarily plausible.
EDIT: Relevant Twitter thread
Specifically, I was saying that wrong results would come up if you failed at one of the steps of reasoning, and that there’s no self-correction mechanism for the kind of bad reasoning Sam Bankman-Fried was doing.