EA’s greatest strength, in my mind, is our epistemic ability—our willingness to weigh the evidence and carefully think through problems. All of the billions of dollars and thousands of people working on the world’s most pressing problems came from that, and we should continue to have that as our top priority.
Thus, I’m not comfortable with sentences like “Proposal: change the framing from “Computers might choose to kill us” to “Humans will use computers to kill us” regardless of whether either potential outcome is more likely than the other.” We shouldn’t be misleading people, including by misrepresenting our beliefs. Plus, remember—if you tell one lie, the truth is forever after your enemy. What if I’m a new EA engaging with AI safety arguments, and you use that argument on me, and I push back? Maybe I say something like “Well, if the problem is that humans will use computers to kill us, why not give the computer enough agency that, if the humans tell it to kill us, the computer tells us to shove it?”
This would obviously be a TERRIBLE idea, but it’s not obvious how you could argue against it within the framework you’ve just constructed where humans are the real danger. Every good argument against this comes from the idea that agentic AI’s are super dangerous, which contrasts the claim you just made. If the danger is humans using these weapons to kill each other, giving the AI’s more agency might be a good idea. If the danger is computers choosing to kill humans, giving the AI’s more agency is a terrible idea. I’m sure you could come up with a way of reconciling these examples, but you’ll notice that it sounds a bit forced, and I bet there are more sophisticated arguments I couldn’t come up with in two minutes that would further separate these two worlds.
We have to be able to think clearly about these problems to solve them, especially AI alignment, which is such a difficult problem to even properly comprehend. I feel like this would be both counterproductive and just not the direction EA should be going. Accuracy is super important—it’s what brought EA from a few people wanting to find the world’s best charities to what we have today.
EA’s greatest strength, in my mind, is our epistemic ability—our willingness to weigh the evidence and carefully think through problems. All of the billions of dollars and thousands of people working on the world’s most pressing problems came from that, and we should continue to have that as our top priority.
Thus, I’m not comfortable with sentences like “Proposal: change the framing from “Computers might choose to kill us” to “Humans will use computers to kill us” regardless of whether either potential outcome is more likely than the other.” We shouldn’t be misleading people, including by misrepresenting our beliefs. Plus, remember—if you tell one lie, the truth is forever after your enemy. What if I’m a new EA engaging with AI safety arguments, and you use that argument on me, and I push back? Maybe I say something like “Well, if the problem is that humans will use computers to kill us, why not give the computer enough agency that, if the humans tell it to kill us, the computer tells us to shove it?”
This would obviously be a TERRIBLE idea, but it’s not obvious how you could argue against it within the framework you’ve just constructed where humans are the real danger. Every good argument against this comes from the idea that agentic AI’s are super dangerous, which contrasts the claim you just made. If the danger is humans using these weapons to kill each other, giving the AI’s more agency might be a good idea. If the danger is computers choosing to kill humans, giving the AI’s more agency is a terrible idea. I’m sure you could come up with a way of reconciling these examples, but you’ll notice that it sounds a bit forced, and I bet there are more sophisticated arguments I couldn’t come up with in two minutes that would further separate these two worlds.
We have to be able to think clearly about these problems to solve them, especially AI alignment, which is such a difficult problem to even properly comprehend. I feel like this would be both counterproductive and just not the direction EA should be going. Accuracy is super important—it’s what brought EA from a few people wanting to find the world’s best charities to what we have today.
I agree with you here.