Post summary (feel free to suggest edits!):

Rob paraphrases Nate’s thoughts on capabilities work and the landscape of AGI organisations. Nate thinks:

1. Capabilities work is a bad idea, because it isn’t needed for alignment to progress and it could speed up timelines. We already have many ML systems to study, and our understanding of them lags behind. Publishing that work is even worse.
2. He appreciates OpenAI’s charter, its openness to talking with EAs / rationalists, its clearer alignment effort relative to FAIR or Google Brain, and its transparency about its plans. He considers DeepMind on par with OpenAI, and Anthropic slightly ahead, in taking alignment seriously.
3. OpenAI, Anthropic, and DeepMind are unusually safety-conscious AI capabilities orgs (e.g., much better than FAIR or Google Brain). But reality doesn’t grade on a curve, there’s still a lot to improve, and they should still call a halt to mainstream SotA-advancing potentially-AGI-relevant ML work, since the timeline-shortening harms currently outweigh the benefits.
(If you’d like to see more summaries of top EA and LW forum posts, check out the Weekly Summaries series.)
Thanks, Zoe! This is great. :) Points 1 and 2 of your summary are spot-on.
Point 3 is a bit too compressed:
Even if an organisation does well for the reference class “AI capabilities org”, it’s better for it to stop, and others not to join that class, because positive effects are outweighed by any shortening of AGI timelines. This applies to all of OpenAI, DeepMind, FAIR, Google Brain etc.
“This applies to all of OpenAI, DeepMind, FAIR, Google Brain” makes it sound like “this organization does well for the reference class ‘AI capabilities org’” applies to all four orgs. In fact, we think OpenAI, DeepMind, and Anthropic are doing well for that class, while FAIR and Google Brain are not.
“Even if an organisation does well for the reference class ‘AI capabilities org’, it’s better for it to stop” also makes it sound like Nate endorses this as true for all possible capabilities orgs in all contexts. Rather, Nate thinks it could be good to do capabilities work in some contexts; it just isn’t good right now. The intended point is more like:
OpenAI, Anthropic, and DeepMind are unusually safety-conscious AI capabilities orgs (e.g., much better than FAIR or Google Brain). But reality doesn’t grade on a curve, there’s still a lot to improve, and they should still call a halt to mainstream SotA-advancing potentially-AGI-relevant ML work, since the timeline-shortening harms currently outweigh the benefits.
Thanks, and that makes sense. I’ve edited the summary to reflect your suggestion.