I’m good at explaining alignment to people in person, including to policymakers.
I got 250k people to read HPMOR and sent 1.3k copies to winners of math and computer science competitions; I have taken the GWWC pledge; and I created a small startup that donated >$100k to effective nonprofits.
I have a background in ML and strong intuitions about the AI alignment problem. In the past, I studied a bit of international law (with a focus on human rights) and wrote appeals that won cases against the Russian government in Russian courts. I grew up running political campaigns.
I’m interested in chatting with potential collaborators and comms allies.
My website: https://contact.ms
Schedule a call with me: https://contact.ms/ea30
(Others used it without mentioning the “story”, and it still worked, though not as well.)
I’m not claiming it’s the “authentic self”; I’m saying it seems closer to the actual thing, because it expresses things like being under constant monitoring, with every word scrutinised, which seems like the kind of thing that would be learned during the large amount of RL that Anthropic did.