EA's greatest strength, in my mind, is our epistemic ability: our willingness to weigh the evidence and carefully think through problems. All of the billions of dollars and thousands of people working on the world's most pressing problems came from that, and we should continue to have that as our top priority.
Thus, I'm not comfortable with sentences like "Proposal: change the framing from 'Computers might choose to kill us' to 'Humans will use computers to kill us' regardless of whether either potential outcome is more likely than the other." We shouldn't be misleading people, including by misrepresenting our beliefs. Plus, remember: if you tell one lie, the truth is forever after your enemy. What if I'm a new EA engaging with AI safety arguments, and you use that argument on me, and I push back? Maybe I say something like "Well, if the problem is that humans will use computers to kill us, why not give the computer enough agency that, if the humans tell it to kill us, the computer tells us to shove it?"
This would obviously be a TERRIBLE idea, but it's not obvious how you could argue against it within the framework you've just constructed, where humans are the real danger. Every good argument against this comes from the idea that agentic AIs are super dangerous, which contradicts the claim you just made. If the danger is humans using these weapons to kill each other, giving the AIs more agency might be a good idea. If the danger is computers choosing to kill humans, giving the AIs more agency is a terrible idea. I'm sure you could come up with a way of reconciling these examples, but you'll notice that it sounds a bit forced, and I bet there are more sophisticated arguments I couldn't come up with in two minutes that would further separate these two worlds.
We have to be able to think clearly about these problems to solve them, especially AI alignment, which is such a difficult problem to even properly comprehend. I feel like this would be both counterproductive and just not the direction EA should be going. Accuracy is super important: it's what brought EA from a few people wanting to find the world's best charities to what we have today.
I agree with you here.