Oops, I saw your question when you first posted it on LessWrong but forgot to get back to you, Issa. My apologies.
I think there are two main kinds of strategic thought we had in mind when we said “details forthcoming”:
1. Thoughts on MIRI’s organizational plans, deconfusion research, and how we think MIRI can play a role in improving the future — this is covered by our November 2018 update post, https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/.
2. High-level thoughts on things like “what we think AGI developers probably need to do” and “what we think the world probably needs to do” to successfully navigate the acute risk period.
Most of the stuff discussed in “strategic background” is about 2: not MIRI’s organizational plan, but our model of some of the things humanity likely needs to do in order for the long-run future to go well. Some of these topics are reasonably sensitive, and we’ve gone back and forth about how best to talk about them.
Within the macrostrategy / “high-level thoughts” part of the post, the densest part was maybe 7a. The criteria we listed for a strategically adequate AGI project were “strong opsec, research closure, trustworthy command, a commitment to the common good, security mindset, requisite resource levels, and heavy prioritization of alignment work”.
With most of these it’s reasonably clear what’s meant in broad strokes, though there’s a lot more I’d like to say about the specifics. “Trustworthy command” and “a commitment to the common good” are maybe the most opaque. By “trustworthy command” we meant things like:
- The organization’s entire command structure is fully aware of the difficulty and danger of alignment.
- Non-technical leadership can’t interfere and won’t object if technical leadership needs to delete a code base or abort the project.
By “a commitment to the common good” we meant a commitment to both short-term goodness (the immediate welfare of present-day Earth) and long-term goodness (the achievement of transhumanist astronomical goods), paired with a real commitment to moral humility: not rushing ahead to implement every idea that sounds good to the project’s leadership.
We still plan to produce more long-form macrostrategy exposition, but given how many times we’ve failed to word our thoughts in a way we felt comfortable publishing, and given how much other stuff we’re also juggling, I don’t currently expect us to have any big macrostrategy posts in the next 6 months. (Note that I don’t plan to give up on trying to get more of our thoughts out sooner than that, if possible. We’ll see.)