The benefits and risks of optimism (about AI safety)
This is a reaction to Nora Belrose and Quintin Pope’s AI Optimism initiative. Others are better qualified to criticize the specific arguments they give for their claim that AI is easy to control, so I will instead focus on the general stance of optimism: when it can be beneficial, and when it may be delusional and dangerous.
The benefits of optimism
I have been an optimist all my life. This has led me into many dead ends, but it has also been the basis for all the successes I have achieved. For example, I had wanted to write a novel since I was a teenager. I finally sat down to do it when I was 43 and the start-up I had founded three years earlier was not doing very well. I sent my first effort, a children’s book, to 20 publishers and got 20 rejections. I wrote a second and a third novel which no one wanted to publish (self-publishing wasn’t really an option back then). My fourth novel was finally accepted and became an instant bestseller. I have published almost 70 books since then, and I don’t intend to stop anytime soon.
I define “optimism” as the tendency to weigh positive outcomes more heavily, and negative outcomes less heavily, than their expected value warrants. The probability of my fourth novel becoming a success after three disappointments didn’t seem very high, but I didn’t even think about that. I was optimistic that I had learned something and that it was worth a try anyway.
A true Bayesian can never be an optimist (nor a pessimist, which would be the opposite). So optimism must be stupid, right?
Not necessarily. Optimism has obvious and less obvious benefits. One obvious benefit is that it feels better to be optimistic. Optimistic people are also more fun to hang around with, so it’s easier for them to make and maintain social connections. Optimism can even become a self-fulfilling prophecy: if you believe in your own success, others tend to believe in you too and will be more willing to help you or fund your efforts.
Our human nature obviously favors optimism and sometimes even recklessness, so there must be an evolutionary advantage behind it. And there is: optimism is a driver of growth and learning, which makes it easier to adapt to changing circumstances. Imagine two populations, one consisting only of “realists” who always make correct decisions based on expected values, the other of optimists who take risks even when the expected value is negative. The realists will have a higher survival rate, but the optimists will spread farther and adapt to a wider range of circumstances. It takes a lot of optimism to cross a steep mountain range to find out what’s in the valley on the far side, or to set sail for an unknown continent. So, after a while, the optimists will likely be the larger population.
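A toy simulation can make this thought experiment concrete. The sketch below is my own illustration, not part of the original argument, and every number in it (death rates, growth factors, carrying capacities) is made up; it only shows the qualitative effect that exploration is a terrible bet for the individual explorer, yet the exploring population ends up far larger.

```python
# Toy model: "realists" never gamble on exploration; "optimists" send a fraction
# of every generation across the mountains. Most explorers die (a bad bet for the
# individual), but each survivor opens up new land for the whole population.

def simulate(explore_fraction, generations=50):
    population = 10.0       # people in the settled valleys
    capacity = 1_000.0      # carrying capacity of the land settled so far
    deaths = 0.0            # cumulative deaths from failed explorations
    for _ in range(generations):
        explorers = population * explore_fraction
        survivors = explorers * 0.3              # 70% of explorers die
        deaths += explorers - survivors
        capacity += survivors * 200.0            # new land found by survivors
        population = min((population - explorers + survivors) * 1.3, capacity)
    return population, deaths

for label, fraction in [("Realists ", 0.0), ("Optimists", 0.1)]:
    final_population, total_deaths = simulate(fraction)
    print(f"{label}: population {final_population:>11,.0f}, exploration deaths {total_deaths:>9,.0f}")
```

In this toy world the realists never lose anyone, but they also never leave their first valley and stall at its carrying capacity, while the optimists pay a steady price in failed expeditions and still end up orders of magnitude more numerous.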
The guy who won the lottery is almost certainly an optimist, at least regarding his chances of winning the lottery. Most successful company founders are optimists. Scientists who explore new directions even though their peers tell them that this is hopeless are optimists in a way, too. Optimism, the belief that you can achieve the seemingly impossible, gives you the energy and motivation to try out new things. Arguably, optimism is the driver behind all technological progress.
The risks of optimism
However, being an optimist is risky. I have published close to 70 books and written some more, but I had many setbacks along the way, and many of my books haven’t sold very well. I have founded four companies, none of which became very successful. I have tried many things, from music to board games to computer games to videos to developing Alexa skills, all of which failed miserably. My best guess is that I have spent about 80% of my time on failed efforts (including boring and unfulfilling jobs that led nowhere) and only 20% on successful ones. Still, I don’t regret these failed experiments. I learned from them, and often, as in the case of my writing, the failures turned out to be necessary, or at least helpful, steps towards success.
However, I obviously never risked my life for any of these failed efforts. I didn’t even risk my financial future when I founded my start-ups. I gave up secure jobs and did lose money, but I didn’t take on large debts. I always made sure that my personal downside risk was limited, because I knew that I might fail and I had a family to care for. Writing a book is not a very big investment; I could do it in my spare time alongside my regular job. Rejections hurt, but they don’t kill you. Writing is even fun, so the cost of writing my fourth novel was almost zero, and the downside of failure was that the effort would once again be largely wasted (apart from what I learned in the process).
On the other hand, there are optimists who pay for their optimism with their lives, from explorers who got killed by boldly going where no one had gone before to overconfident soldiers and scientists. The disaster of the American withdrawal from Afghanistan, for instance, which led to an unexpectedly swift takeover by the Taliban, may have been due to an overly optimistic assessment of the situation. Darwin Award winners were almost certainly optimists (besides being obviously stupid).
Being optimistic about whether the random unknown mushroom you picked up in the woods is safe to eat is a bad idea. Optimism is not appropriate when it comes to airplane safety, IT security, or dangerous biological experiments.
In short: optimism can be beneficial when the downside is limited and the upside is large, even when the probability of success is so low that the expected value is negative. Optimism is bad when it is the other way round. Which brings us to AI safety.
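To make this asymmetry concrete before turning to the AI case, here is a toy expected-value comparison. The numbers are purely illustrative and my own, including a deliberately generous 90% chance that control works out:

```python
# Toy expected-value comparison (all numbers are made up for illustration).

def expected_value(p_success, upside, downside):
    return p_success * upside - (1 - p_success) * downside

# Writing another novel: low odds, nice upside, small and capped downside.
print(expected_value(p_success=0.01, upside=50, downside=1))            # ≈ -0.49
# Negative, but the worst case is losing one unit of spare-time effort.

# Racing towards AGI and hoping control works out: even with a generous 90%
# chance of success, a downside on the scale of losing the future swamps
# the likely payoff.
print(expected_value(p_success=0.9, upside=1_000, downside=1_000_000))  # ≈ -99,100
```

The point is not the particular numbers but the shape of the payoff: a small, capped loss can be worth risking even at bad odds, while an effectively unbounded loss cannot.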
Why being generally optimistic about AI safety is bad
Claiming that “AI is easy to control” when heavyweights like Geoffrey Hinton, Yoshua Bengio, and many others hold a different opinion can be seen as quite an optimistic stance. To their credit, Nora Belrose and Quintin Pope openly admit this and even call their initiative “AI Optimism”.
As I have pointed out, there are some things to be said in favor of optimism. This is true even for AI safety. Being optimistic gives you the energy and motivation to try things you otherwise might not. Two years ago, I was much more optimistic about my own ability to push Germany towards acknowledging existential risks from AI than I am today, and I find it increasingly difficult to get up and even try to do anything about it. A bit of optimism could possibly help me do more than I am actually doing right now, and in theory it could lead to a success against all odds.
In this sense, I am supportive of optimism, for example about trying out specific new approaches in AI safety, like mechanistic interpretability. If the downside is just the time and effort you spend on a particular AI safety approach and the potential (if unlikely) upside is that you solve alignment and save the world, then please forget about the actual success probability and be optimistic about it (unless you have an even better idea that is more likely to succeed)!
However, being optimistic about our ability to control superintelligent AI and/or solve the alignment problem in time so we can just race full speed ahead towards developing AGI is an entirely different matter. The upside in this case is some large financial return, mostly going to people who are already insanely rich, and maybe some benefits to humanity in general (which I think could mostly also be achieved with less risky methods). The downside is destroying the future of humanity. Being optimistic in such a situation is a very bad idea.
An additional problem is that optimism is contagious. Politicians like to be optimistic because voters like it too. This may be part of the explanation for why it is still very unpopular in Germany to talk about existential risks from AI, why our government thinks it is a good idea to exclude foundation models from the EU AI Act, and why people concerned about AI safety are called “doomers”, “neo-luddites”, or even “useful idiots”. People want to be optimistic, so they look for confirmation and positive signals. And if well-respected AI researchers found an organization called “AI Optimism”, this will certainly increase overall optimism, even if the initiative is largely met with skepticism in the AI safety community.
As I have pointed out, optimism is dangerous when the downside is very large or even unlimited. Therefore, I think general “AI Optimism” is a bad idea. This is largely independent of the detailed discussion about how hard controlling AI actually is. As long as they cannot prove that they have solved the control problem or AGI alignment, “AI Optimism” certainly diminishes my personal hope for our future.
I don’t think this is the definition that Pope & Belrose are using. I think they are using it in the same sense as “I’m optimistic about my relationship”: A genuine belief that something will go well.
I think they claim to be optimistic because they genuinely believe that the development of AI will have good effects and that significant harms are unlikely, and they want policies such as open sourcing to reflect that.
I don’t think that there’s a huge difference. As long as there aren’t very strong fact-based arguments for exactly how likely it is that we will be able to control AGI, an “optimist” in my sense will end up assigning a significantly higher probability to things going well. From what I have read, I believe that Belrose and Pope have this basic bias towards “AGI is beneficial” and weigh the upside potential higher than the downside risks, and they then present arguments in favor of that position. This is, of course, just an impression; I can’t prove it. In any case, even if they genuinely believe that everything they say is correct, they should still add massive caveats and point out exactly where their arguments are weak or could be questioned. But that is not what a self-declared “optimist” does, so instead they just present their beliefs. That’s okay, but it is clearly a sign of optimism as I define it.
This seems like a selective demand. I believe that doomers have a bias towards “AGI is destructive”. Will you comment on doomer posts, demanding they add in massive caveats and point out exactly where their arguments are weak?
If you don’t agree with Pope and Belrose, argue with them on the facts. Don’t argue with disingenuous semantic games, pretending that the word “optimist” doesn’t have more than one definition in the dictionary.
I agree that some “doomers” (you may count me as one) are “pessimistic” in the sense of being biased towards a negative outcome. I can’t rule out that I’m “overly cautious”. However, I’d argue that this is net positive for AI safety, on the same grounds on which I think optimism, as I defined it, is net positive under different circumstances, as described above.
I agree that the word “optimism” can be used in different ways; that’s why I gave a definition of the way I usually use it. My post was a reaction to Pope and Belrose, but, as I stated, it is not about their specific arguments; it is about being “optimistic” in the way I defined it. Nora Belrose said in a comment on LessWrong that my way of defining optimism is not how they meant it, and as long as I don’t analyze their texts, I have to accept that. But I think my definition fits within the range of common uses of the word (see Wikipedia, for example). All I did was try to point out that this kind of “positive outcome bias” may be beneficial under certain circumstances, but not for thinking about AI safety.
I believe that even if Pope and Belrose are trying to take a truly rational and unbiased stance, the term “AI Optimism” is at least misleading, as it can be understood in the way I have understood it. I hope this post is helpful at least in pointing out that possible misunderstanding.
Executive summary: The author argues that optimism can drive progress but becomes dangerous when downside risks are high, as with optimistic assumptions about AI safety.
Key points:
Optimism motivates exploration and fuels adaptation, though often inefficiently. It likely evolved to promote growth despite negative expected outcomes.
However, optimism has downsides like wasted effort and becomes clearly problematic when potential harms are severe or irreversible.
Developing advanced AI intrinsically risks catastrophic and existential harms if control fails, so optimism about AI safety seems unwarranted and dangerous absent strong arguments that control is achievable.
More limited optimism about trying particular approaches to AI safety may still be warranted based on upside, even if success chances are low.
But general optimism about controlling whatever AGI systems get developed risks trivializing crucial safety issues and diminishes the chances of averting disaster.
This comment was auto-generated by the EA Forum Team.