Technical AI Governance research at MIRI
peterbarnett
Announcing: 2026 MIRI Technical Governance Team Research Fellowship.
MIRI’s Technical Governance Team plans to run a small research fellowship program in early 2026. The program will run for 8 weeks, and include a $1200/week stipend. Fellows are expected to work on their projects 40 hours per week. The program is remote-by-default, with an in-person kickoff week in Berkeley, CA (flights and housing provided). Participants who already live in or near Berkeley are free to use our office for the duration of the program.
Fellows will spend the first week picking out a scoped project from a list provided by our team or designing an independent research project (related to our overall agenda), and then spend the remaining seven weeks working on that project under the guidance of our Technical Governance Team. One of the main goals of the program is to identify full-time hires for the team.
If you are interested in participating, please fill out this application as soon as possible (it should take 45-60 minutes). We plan to set dates for participation based on applicant availability, but we expect the fellowship to begin after February 2, 2026 and end before August 31, 2026 (i.e., some 8-week period in spring/summer 2026).
Strong applicants care deeply about existential risk, have existing experience in research or policy work, and are able to work autonomously for long stretches on topics that merge considerations from the technical and political worlds.
Unfortunately, we are not able to sponsor visas for this program.
Could/should big EA-ish coworking spaces like Constellation pay to have far-UV installed? (either on their floors specifically or for the whole building)
MATS has a very high bar these days, so I’m pretty happy about there being “knock-off MATS” programs that allow people who missed the bar for MATS to demonstrate they can do valuable work.
I still kinda feel this way about Asterisk (my opinion would change if I learned that the readership wasn’t just EAs)
AI-Generated Podcast for the 2021 MIRI Conversations.
I made an AI-generated podcast of the 2021 MIRI Conversations. There are different voices for the different participants, to make it easier and more natural to follow along.
This was done entirely in my personal capacity, and not as part of my job at MIRI.[1] I did this because I like listening to audio and there wasn’t a good audio version of the conversations.
Spotify link: https://open.spotify.com/show/6I0YbfFQJUv0IX6EYD1tPe
RSS: https://anchor.fm/s/1082f3c7c/podcast/rss
Apple Podcasts: https://podcasts.apple.com/us/podcast/2021-miri-conversations/id1838863198
Pocket Casts: https://pca.st/biravt3t
[1] I do think you probably should (pre-)order If Anyone Builds It, Everyone Dies though.
Thanks for your comment :) Sorry you’re finding all the book posts annoying; I decided to post here after seeing that there hadn’t been a post on the EA Forum.
I’m not actually sure what book content I’m allowed to talk about publicly before the launch. Overall, the book is written much more for an audience who are new to the AI x-risk arguments (e.g., policymakers and the general public), and it is less focused on providing new arguments to people who have been thinking/reading about this for years (although I do think they’ll find it an enjoyable and clarifying read). I don’t think it’s trying to go 15 arguments deep into a LessWrong argument chain. That said, I think there is new stuff in there: the arguments are clearer than before, there are novel framings, and I would guess there are at least some things in there that you would find new. I don’t know if I would expect people from the “Pope, Belrose, Turner, Barnett, Thornley, 1a3orn” crowd to be convinced, but they might appreciate the new framings. There will also be related online resources, which I think will cover more of the argument tree, although again, I don’t know how convincing this will be to people who are already in deep.
Here’s what Nate said in the LW announcement post:
If you’re a LessWrong regular, you might wonder whether the book contains anything new for you personally. The content won’t come as a shock to folks who have read or listened to a bunch of what Eliezer and I have to say, but it nevertheless contains some new articulations of our arguments, that I think are better articulations than we’ve ever managed before.
I would guess many people from the OpenPhil/Constellation cluster would endorse the book as a good distillation. But insofar as it moves the frontier of online arguments about AI x-risk forward, it will mainly be by stating the arguments more clearly (which imo is still progress).
Consider Preordering If Anyone Builds It, Everyone Dies
I have a bunch of disagreements with Good Ventures and how they are allocating their funds, but also Dustin and Cari are plausibly the best people who ever lived.
AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions [MIRI TGT Research Agenda]
AGI by 2028 is more likely than not
Look at the resolution criteria, which are based on the specific Metaculus question; it seems like a very low bar.
I didn’t read the post, so this isn’t feedback. I just wanted to share my related take that I only want feedback if it’s positive, and otherwise people should keep their moronic opinions to themselves.
Update from Anthropic: https://twitter.com/AnthropicAI/status/1869139895400399183
I would guess “grants made to Neil’s lab” refers to the MIT FutureTech group, which he directs. FutureTech says on its website that it has received grants from OpenPhil, and the OpenPhil website doesn’t seem to mention a grant to FutureTech anywhere, so I assume the OpenPhil FutureTech grant was the grant made to Neil’s lab.
I think it’s worth noting that the two papers linked (which I agree are flawed and not that useful from an x-risk viewpoint) don’t acknowledge OpenPhil funding, and so maybe the OpenPhil funding is going towards other projects within the lab.
I think that Neil Thompson has some work which is pretty awesome from an x-risk perspective (often in collaboration with people from Epoch):
From skimming his Google Scholar, a bunch of other stuff seems broadly useful as well.
In general, research forecasting AI progress and economic impacts seems great, and even better if it’s from someone academically legible like Neil Thompson.
Relatedly, I think that the “Should you work at a leading AI company?” article shouldn’t start with a pros and cons list which sort of buries the fact that you might contribute to building extremely dangerous AI.
I think “Risk of contributing to the development of harmful AI systems” should at least be at the top of the cons list. But overall this sort of reminds me of my favorite graphic from 80k:
Insofar as you are recommending the jobs but not endorsing the organization, I think it would be good to be fairly explicit about this in the job listing. The current short description of OpenAI seems pretty positive to me:
OpenAI is a leading AI research and product company, with teams working on alignment, policy, and security. You can read more about considerations around working at a leading AI company in our career review on the topic. They are also currently the subject of news stories relating to their safety work.
I think this should say something like “We recommend jobs at OpenAI because we think these specific positions may be high impact. We would not necessarily recommend working at other jobs at OpenAI (especially jobs which increase AI capabilities).”
I also don’t know what to make of the sentence “They are also currently the subject of news stories relating to their safety work.” Is this an allusion to the recent exodus of many safety people from OpenAI? If so, I think it’s misleading and gives far too positive an impression.
Do you mean the posts early last year about fundamental controllability limits?
Yep, that is what I was referring to. It does seem like you’re likely to be more careful in the future, but I’m still fairly worried about advocacy done poorly. (Although, like, I also think people should be able to do advocacy if they want.)
I have similar views to Marius’s comment. I did AISC in 2021 and I think it was somewhat useful for starting in AI safety, although I think my views and understanding of the problems were pretty dumb in hindsight.
AISC does seem extremely cheap (at least for the budget options). If you put, like, 80% on the “Only top talent matters” model (MATS, Astra, others) and 20% on the “Cast a wider net” model (AISC), I would still guess that AISC is a good thing to do.
My main worries here are about the negative effects. These are mainly related to the “To not build uncontrollable AI” stream; 3 out of 4 of these projects seem to be about communication/politics/advocacy.[1] I’m worried about these having negative effects, making AI safety people seem crazy, uninformed, or careless. I’m mainly worried about this because Remmelt’s recent posting on LW really doesn’t seem like careful or well-thought-through communication. (In general I think people should be free to do advocacy etc., although please think about externalities.) Part of my worry is also that AISC is a place for new people to enter the field, and new people might not know how fringe these views are in the AI safety community.
I would be more comfortable with these projects (and they would potentially still be useful!) if they were more focused on understanding the things they are advocating for. E.g., a report on “How could lawyers and coders stop AI companies from using their data?”, rather than attempting to start an underground coalition.
All the projects in the “Everything else” streams (run by Linda) seem good or fine, and likely a decent way to get involved and start thinking about AI safety. Although, as always, there is a risk of wasting time with projects that end up being useless.
[ETA: I do think that AISC is likely good on net.]
[1] The other one seems like a fine/non-risky project related to domain whitelisting.
We may be running multiple smaller cohorts rather than one big one, if that’s what maximizes the ability of strong candidates to participate.
The single most important factor in deciding the timing is the window in which strong candidates are available, and the target size for the cohort is small enough (5-20 depending on strength of applicants) that the availability of a single applicant is enough to sway the decision. It’s specifically cases like yours that we’re intending to accommodate. Please apply!