Hi everyone, I'm William the Kiwi and this is my first post on the EA Forum. I recently discovered AI alignment and have been reading about it for around a month. It seems like an important but terrifyingly under-invested-in field. I have many questions, but in the interest of speed I will invoke Cunningham's Law and post my current conclusions.
My AI conclusions:
1. Corrigibility is mathematically impossible for AGI.
2. Alignment requires defining all important human values robustly enough that the definition survives near-infinite optimisation pressure exerted by a superintelligent AGI (a toy sketch of this appears after the list). Alignment is therefore difficult.
3. Superintelligence by Nick Bostrom is a way of communicating the antimeme "unaligned AI is dangerous" to the general public.
4. The extinction of humanity is a plausible outcome of unaligned AI.
5. Eliezer Yudkowsky seems overly pessimistic, but is likely correct about most of what he says.
6. Humanity is likely to produce AGI before it produces fully aligned AI.
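To make conclusion 2 concrete, here is a minimal toy sketch of Goodhart's law in Python. Everything in it is an illustrative assumption of mine rather than a model of any real system: a random-search optimiser stands in for optimisation pressure, and a deliberately narrow flaw in the proxy objective stands in for a gap in our written-down definition of human values.

```python
# Toy illustration of conclusion 2 (Goodhart's law under optimisation
# pressure). The functions and numbers below are illustrative
# assumptions, not a model of any real AI system.

import random

random.seed(0)  # make the run repeatable

def true_value(x: float) -> float:
    """What we actually care about; maximised at x = 1."""
    return -(x - 1.0) ** 2

def proxy_value(x: float) -> float:
    """Our imperfect specification of true_value: it agrees everywhere
    except a narrow region around x = 9, where it wrongly pays out."""
    spike = 100.0 if abs(x - 9.0) < 0.001 else 0.0
    return true_value(x) + spike

def optimise(objective, steps: int) -> float:
    """Random search over [-10, 10]; `steps` plays the role of pressure."""
    best_x = 0.0
    best = objective(best_x)
    for _ in range(steps):
        x = random.uniform(-10.0, 10.0)
        v = objective(x)
        if v > best:
            best, best_x = v, x
    return best_x

for steps in (10, 100, 1_000_000):
    x = optimise(proxy_value, steps)
    print(f"pressure={steps:>9}  x={x:+6.2f}  "
          f"proxy={proxy_value(x):7.1f}  true={true_value(x):7.1f}")
```

On almost any run, the weakly optimised answers land near x = 1, where proxy and true value agree, while the heavily optimised answer lands in the misspecified spike: high proxy score, terrible true value. A superintelligence searching a far richer space would, by this argument, find every such gap in our specification.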
To incentivize responses to this post, I am offering a £1000 reward for a response that supports or refutes each of these conclusions and provides evidence for it.
I am currently visiting England and would love to talk more about this topic with people, either over the Internet or in person.