If you haven’t already, I’d recommend reading Richard Ngo’s AGI Safety From First Principles, which I think is an unusually rigorous treatment of the issue.
I had it bookmarked, but hadn’t looked at it yet. Thanks for the recommendation!
Also check out the AGI Safety Fundamentals Alignment Curriculum and corresponding Google doc. The Intro to ML Safety material might also be of interest.
Thanks for this post! Seeing how many global challenges are in a sense alignment problems is also what brought me on board with understanding AI safety. Climate change and social media are good touchstones for what I think of as social/political alignment issues.
I don’t know if this is exactly correct (so someone correct me if I’m off base), but I find the AI alignment issue especially hard to wrap my head around because it doesn’t seem like we have good solutions yet at almost any level, technical or social/political. Here’s how I think of the two in my head:
technical alignment: can we get an inconceivably smart optimizing machine to follow what we really want it to do, rather than taking the letter of its programming down paths that would be bad for us? Can we look into the black box to know what the heck is going on, so that we can stop it if needed?
AND
social/political alignment: can we as humans create and uphold fair and effective regulation of power in a globalized economy without a strong world government? Can we design laws and social norms that prevent catastrophe as more and more people and businesses gain access to increasingly powerful machines that do exactly what they are asked (blow people up with enormous accuracy, if you want them to) and have unintended side effects (influencing elections through social media algorithms)?
With AI we don’t have either. It is sort of as if runaway climate change were happening and we didn’t yet understand that CO2 was part of the root cause or something.
The fact that a lot of x-risk issues share common threads in the social/political alignment sphere is interesting to me, and is one of my main arguments for why EAs should pay more attention to climate change. It shares some of the same global game-theory elements as other issues like pandemics and AI regulation, and work on x-risks as a whole may be stronger if there is a lot of cross-pollination of strategies and learnings, ESPECIALLY because climate change is less neglected and has seen some amount of progress in recent decades.
Yeah! It definitely seems like AI alignment is difficult in both the aspects you mention, technical and social, whereas something like climate change is mainly difficult from a social perspective. I feel like getting social media right is something we don’t actually know how to solve technically either, so maybe that’s another motivation for using it as a test case.
Overall, the realisation of the scale of the challenge of just the social aspect is what has really got my attention.
The corporate alignment problem does precede the AI alignment problem. In some sense we rather deliberately misaligned corporations by giving them a single goal, relying on the human agency and motivation embedded within the system to keep them from running amok. But as they became more sophisticated and competed with each other, this became unreliable, and we have instead tried to restrain and incentivize them with regulation, which has also not been entirely satisfactory.
Steinbeck was prescient (or just a keen observer):
“It happens that every man in a bank hates what the bank does, and yet the bank does it. The bank is something more than men, I tell you. It’s the monster. Men made it, but they can’t control it.”
Unfortunately the gap between politically feasible solutions and ones that seem likely to actually be effective is pretty large in this area.
Yeah, totally. And I don’t see how these problems won’t translate directly into problems with AI alignment, since the most likely places to first deploy AGI are corporations.