My impression is that it's easy to contribute to "un-nuanced and inaccurate" discourse or hype about artificial intelligence while talking about AI safety. Personally, I'm interested in doing AI safety research, so I need to be able to explain the motivation for my work to people who may be unfamiliar with the field. How do you explain AI safety accurately and without hyping it up too much?
While I haven't read the book, Slate Star Codex has a great review of Human Compatible. Scott says it discusses AI safety, especially in the long-term future, in a very professional-sounding and not-weird way. So I suggest reading that book, or that review.
You could also list several smaller-scale AI misalignment problems, such as the problems surrounding Zuckerberg and Facebook. You could say something like "You know how Facebook's AI is programmed to keep you on as long as possible, so often it will show you controversial content in order to rile you up, and get everyone yelling at everyone else so you never leave the platform? Yeah, I make sure that won't happen with smarter and more influential AIs." If all you're going for is an elevator speech, or explaining to family what it is you do, I'd stop here. Otherwise, follow the first part with something like "By my estimation, this seems fairly important, as incentives are aligned for companies and countries to use the best AI possible, and better AI means more influential AI, so if you have a really good but slightly sociopathic AI, it's likely it'll still be used anyway. And if, in a few decades, we get to the point where we have a smarter-than-human but still sociopathic AI, it's possible we've just made an immortal Hitler-Einstein combination. Which, needless to say, would be very bad, possibly even extinction-level bad. So if the job is very hard, and the result if the job doesn't get done is very bad, then the job is very very important (that's very²)."
I've never tried using these statements, but they seem like they'd work.
Was going to recommend this as well (and I have read the book).
This isn't a complete answer, but I think it is useful to have a list of prosaic alignment failures to make the basic issue more concrete. Examples include fairness (bad data leading to inferences that reflect bad values), recommendation systems going awry, etc. I think Catherine Olsson has a long list of these, but I don't know where it is. We should generically expect some sort of amplification as AI strength increases; it's conceivable the amplification is in the good direction, but at a minimum we shouldn't be confident of that.
If someone is skeptical about AIs getting smart enough that this matters, you can point to the various examples of existing superhuman systems (game-playing programs, dog distinguishers that beat experts, medical imaging systems that beat teams of experts, etc.). Narrow superintelligence should already be enough to worry about, depending on how such systems are deployed.
note: your link is broken
Fixed, thanks!