I’m good at explaining alignment to people in person, including to policymakers.
I got 250k people to read HPMOR and sent 1.3k copies to winners of math and computer science competitions; have taken the GWWC pledge; and created a small startup that donated >$100k to effective nonprofits.
I have a background in ML and strong intuitions about the AI alignment problem. In the past, I studied a bit of international law (with a focus on human rights) and wrote appeals that won cases against the Russian government in Russian courts. I grew up running political campaigns.
I’m interested in chatting with potential collaborators and comms allies.
My website: https://contact.ms
Schedule a call with me: https://contact.ms/ea30
How do effectiveness estimates change if everyone saved dies in 10 years?
“Saving lives near the precipice”
Has anyone made comparisons of the effectiveness of charities conditional on the world ending in, e.g., 5-15 years?
[I’m highly uncertain about this, and I haven’t done much thinking or research]
For many orgs and interventions, the impact estimates would likely be very different from the default ones made by, e.g., GiveWell. I’d guess the ordering of the most effective non-longtermist charities might change a lot as a result.
It would be interesting to see how the rankings change once at least some estimates account for the world ending in n years.
Maybe one could start by updating GiveWell’s estimates. For DALYs, one would recalculate the values in GiveWell’s spreadsheets that are derived from distributions the end of the world would cap or change (e.g., life expectancy). For the relative values of averting deaths at different ages, one would need to estimate and subtract a term reflecting that the averted deaths still occur at (age + n). Second-order and long-term effects would also differ, but estimating the impact there is probably more time-consuming.
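To make the capping concrete, here is a minimal sketch (not GiveWell’s actual model; the function name and all numbers are illustrative) of how “years of life saved” by averting a death shrinks once remaining lifespan is capped at a horizon of n years:

```python
# Hypothetical sketch: counterfactual life-years gained by averting a
# death, with and without a cap for the world ending in `horizon_years`.
# Illustrative numbers only, not GiveWell's inputs.

def years_saved(age, life_expectancy, horizon_years=None):
    """Years of life gained by averting a death at `age`.

    Without a horizon, this is simply the remaining life expectancy.
    With a horizon, the saved life only lasts until the horizon, so
    the gain is min(remaining life expectancy, horizon_years).
    """
    remaining = max(life_expectancy - age, 0)
    if horizon_years is None:
        return remaining
    return min(remaining, horizon_years)

# Averting a death at age 5 with life expectancy 70:
default = years_saved(5, 70)                    # 65 years, the default estimate
capped = years_saved(5, 70, horizon_years=10)   # 10 years if the world ends in 10
```

Under a 10-year horizon the estimated benefit here drops from 65 life-years to 10, an 85% reduction, which is the kind of shift that could reorder charity rankings when interventions differ in the typical age of the beneficiary.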
This seems like a potentially important question, since many people have short AGI timelines. It might be worthwhile to research this area so that people can weigh the different impact estimates for a charity by their credence in an existential catastrophe.
Please let me know if someone has already worked this out or is working on it, if there’s some reason not to discuss this kind of thing, or if I’m wrong about something.