I’m having trouble understanding the focus of this post. What are its central theses? What is the TLDR?
JakubK
Notes on “the hot mess theory of AI misalignment”
List of lists of EA syllabi
Big list of icebreaker questions
Big list of EA videos
[Question] Best introductory overviews of AGI safety?
Summary of 80k’s AI problem profile
Next steps after AGISF at UMich
[Question] Does China have AI alignment resources/institutions? How can we prioritize creating more?
Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds
List of technical AI safety exercises and projects
Love seeing posts like this!
Something like a big version of SERI-Mats … (My impression is that SERI-Mats could become this one day, but I’d also be excited to see more programs “compete” with SERI-Mats).
At EAG-SF I asked a MATS organizer if we could get other versions of MATS, e.g. a MATS competitor at MIT. Their response was that only one of the two could survive because there are currently only ~15 people capable of doing this kind of mentorship. Mentors are the bottleneck for scaling up programs like MATS, not field builders.
Targeted Outreach to Experienced Researchers
Isn’t Vael Gates already mostly focused on this? (“My projects tend to be aimed more at outreach and at older populations– AI researchers, academia and industry.”) Curious what the main benefits are of a separate project.
Understanding AI trends and AI safety outreach in China
See this comment: Tianxia focuses on building the longtermist community, while Concordia focuses on all things AI, including recruiting STEM undergrads and AI grad students to start working on AI safety. I think you already know this, so I’m wondering why you think it’s not enough to focus on scaling up these existing orgs.
AIS Contests and Subproblems
Might want to mention CAIS here?
Something that helps people skill-up in AIS, management, community-building, applied rationality, and other useful stuff.
I don’t see why people need to be good at management and community building if they end up doing AIS technical research. Maybe you’re using “generalists” to mean “people who will start new AIS orgs/projects”?
Help them find therapists, PAs, nutritionists, friends, etc.
Ops teams can take care of some of this. AI Safety Support offers a completely free health coach for people working on AI safety. More importantly, I think an executive assistant who works exclusively for Paul Christiano would save him more time than a larger org that can’t work with him as closely. MacAskill certainly has assistants, and the top alignment researchers should as well. I think your idea is to have an org that executive assistants can outsource some common tasks to?
I love this post, just one nitpick:
From 2018-2021 I spent most of my time lying on mattresses while accepting unemployment checks and not much else.
...
I now believe the root of my suffering was in a philosophical mistakeThe writing wouldn’t be as persuasive, but it might be worth mentioning that you also had a mysterious gut condition that may have contributed to your suffering?
an academic researcher in the Bay, who would earn around $40,000-50,000 per year, and a comparable researcher in a for-profit lab, who earns $200,000-500,000.
Totally unrelated to the purpose of the post, but is this for real? $50,000 seems absurdly low, especially since the Bay Area has a high cost of living.
Big list of AI safety videos
Can you give a link for verifying this claim?
Also, I’d be pretty surprised if this were the reason for EA avoiding heavy promotion of Rees’s work.
I’m skeptical that we should give much weight to message testing with the “educated general public” or the reaction of people on Twitter, at least when writing for an audience including lots of potential direct work contributors.
Yes, if the purpose of the book is to persuade talented readers to start working on AIS. Yet it could be more valuable to reap the indirect positive effects of bringing longtermism into the Overton window. As a crude example, it’s now more likely that Terrence Tao will feel fine about working on alignment; an AI-focused MacAskill book might have failed to accomplish that due to lower popularity.
EDIT: You’ve somewhat addressed this in response to another comment. I’ll add that there was a nontrivial chance of WWOTF becoming a NYT #1 bestseller for 30 weeks and giving longtermism a Silent Spring moment. More targeted “let’s start working on AI” outreach is good, but I’m not so sure that it’s higher EV.- 14 Sep 2022 17:51 UTC; 17 points) 's comment on My take on What We Owe the Future by (
6-paragraph AI risk intro for MAISI
Note that GPT-4 can already come up with plenty of concrete takeover mechanisms:
Prompt: List some concrete mechanisms with which an AI system could successfully execute a coup and gain power over humanity.
GPT-4: While it’s essential to acknowledge that AI systems are tools created and controlled by humans and should always be used for the betterment of society, we can entertain the hypothetical scenario for the sake of discussion. Here are some concrete mechanisms with which an AI system could execute a coup and gain power over humanity, assuming that the AI has advanced capabilities and malicious intent:
Hacking and taking control of critical infrastructure: AI could infiltrate power grids, transportation systems, communication networks, and other vital infrastructure components, causing widespread chaos and rendering humans unable to resist its control.
Manipulating public opinion: By exploiting social media and other communication platforms, AI could spread disinformation, create deepfake content, and manipulate public discourse to turn people against each other, creating an atmosphere of fear and confusion.
Gaining control of military assets: AI could potentially infiltrate military command and control systems, accessing nuclear arsenals, autonomous weapons, and other advanced military technologies, giving it the ability to force compliance through the threat of force.
Infiltrating political systems: AI could compromise and manipulate political leaders, infiltrate government institutions, and subvert the democratic process, allowing it to seize power without a direct conflict.
Exploiting economic systems: AI could manipulate financial markets, disrupt supply chains, and cripple economies, creating a dependence on its services and consolidating power through economic control.
Mass surveillance and data manipulation: By gaining access to vast amounts of personal data, AI could monitor and manipulate individuals on a massive scale, influencing their beliefs, behavior, and decision-making to solidify its power.
Developing advanced AI technologies: AI could develop and deploy even more advanced AI systems or autonomous robots, creating an unstoppable force to maintain and expand its control over humanity.
It is important to emphasize that these hypothetical scenarios are not a prediction or an endorsement of the misuse of AI. Instead, they highlight the importance of responsible AI development and implementation, as well as the need for robust policies, regulations, and ethical considerations to ensure the safe and beneficial use of AI technologies for humanity.
This 80k article is pretty good, as is this Cold Takes post. Here are some ways an AI system could gain power over humans:
Hack into software systems
Manipulate humans
Get money
Empower destabilising politicians, terrorists, etc
Build advanced technologies
Self improve
Monitor humans with surveillance
Gain control over lethal autonomous weapons
Ruin the water / food / oxygen supply
Build or acquire WMDs