Someone should write a good, linkable online resource describing the concept of the long reflection. It’s very strange that there isn’t a simple post/webpage that I can link to that gives a good, medium-depth description.
Hi Peter, good to meet you! If you are interested in the long reflection, you might be interested in my research, which I will link here, on the broader class of interventions that the long reflection belongs to. I'd really appreciate any feedback or comments on it.
Additionally, if this is something you're interested in, you might be interested in this as a future forum debate topic. I raised it as a potential candidate here, and I'm really hoping it gets enough initial upvotes to become a finalist candidate, as I really think it's an important crux for whether or not we achieve a highly valuable future!
There’s now also the related concept of viatopia, which is maybe a better concept/term. Not sure what the very best links on that are but this one seems a good starting point.
I've seen AI-based animal communication technologies starting to be involved in some EA events / discussions (e.g. https://www.earthspecies.org/ ). I'm worried these initiatives may be actively negative, and I'm wondering if anyone has articulated (or will articulate) a stronger defense of why they're good?
The high-level argument I've heard is that communicating with animals will make humans more empathetic towards them. But I don't see why this would be the most likely outcome:
Humans are already fairly empathetic to animals, especially around things that we'd consider important welfare issues. We don't need a hen to articulately describe why she'd prefer not to have her beak cut off or be kept in a cage; I think that would be fairly obvious to most people.
Animals might become less sympathetic if we knew what they were saying. It seems possible that most of their thoughts and words are about food, sex, and ingroup / outgroup dynamics.
A similar argument is that communication would allow us to see that animals are actually intelligent, but again I don't see why this is necessarily the case. If their thoughts are things people would generally consider crude, it's possible people would become more confident in their lack of intelligence (even though the animals would still deserve moral consideration).
More importantly, a large effect of being able to communicate with animals is that they'll become more useful to humans. If animals had political power or legal rights, this might open the door to mutually beneficial trade. But in reality, they don't have these things, so it seems more likely that this would allow humans to exploit these species more easily. The reason chickens, cows, and pigs are in such a bad state is that they're very useful to humans, and I'm worried animal communication technologies will subject more species to similar fates.
Here’s my current four-point argument for AI risk/danger from misaligned AIs.
We are on the path of creating intelligences capable of being better than humans at almost all economically and militarily relevant tasks.
There are strong selection pressures and trends to make these intelligences into goal-seeking minds acting in the real world, rather than disembodied high-IQ pattern-matchers.
Unlike traditional software, we have little ability to know or control what these goal-seeking minds will do; we can only give them directional input.
Minds much better than humans at seeking their goals, with goals different enough from our own, may end us all, either as a preventative measure or side effect.
Request for feedback: I'm curious whether there are points that people think I'm critically missing, and/or ways that these arguments would not be convincing to "normal people."
I think that your list is really great! As a person who tries to understand misaligned AI better, these are my arguments:
The difference between a human and an AGI might be greater than the difference between a human and a mushroom.
If the difference is that great, the AGI will probably not see much difference between a cow and a human. The way humans treat other animals, the planet, and each other makes it hard to see how we could possibly create an aligned AI that is willing to save a creature like us.
If AGI has self-preservation, we are the only creatures that can threaten its existence, which means that it might want to make sure that we no longer exist, just to be safe.
AGI is a thing that we know nothing about. If it came in a spaceship with aliens, we would probably use enormous resources to make sure it would not threaten our planet. But now we are creating this alien creature ourselves and don't do very much to make sure it isn't a threat to our planet.
Two thoughts here, just thinking about persuasiveness. I'm not quite sure what you mean by "normal people", and also whether you still want your arguments to be actual arguments or just persuasion-maxed.
show don’t tell for 1-3
For anyone who hasn't used frontier models intimately but is willing to do so with an open mind, I'd guess you should just push them to use the models and actually engage mentally with them and their thought traces; even better if you can convince them to use something agentic like CC.
Ask and/or tell stories for 4
What can history tell us about what happens when a significantly more tech-savvy/powerful nation finds another one?
no “right” answer here though the general arc of history is that significantly more powerful nations capture/kill/etc.
What would it be like to be a native during the various European conquests in the New World (esp. ignoring the effects of smallpox/disease to the extent you can)?
Incan perspective? Mayan?
I especially like Orellana's first expedition down the Amazon. As far as I can tell, Orellana was not especially bloodthirsty and had some interest in/respect for the natives. Though he is certainly misaligned with the natives.
Even if Orellana is “less bloodthirsty,” you still don’t want to be a native on that river. You hear fragmented rumors—trade, disease, violence—with no shared narrative; you don’t know what these outsiders want or what their weapons do; you don’t know whether letting them land changes the local equilibrium by enabling alliances with your enemies; and you don’t know whether the boat carries Orellana or someone worse.
Do you trade? attack? flee? coordinate? Any move could be fatal, and the entire situation destabilizes before anyone has to decide “we should exterminate them.”
and for all of these situations you can actually see what happened (approximately) and usually it doesn’t end well.
Why is AI different?
This question isn't rhetorical, and it gives them space to think in a smaller, more structured way that doesn't force an answer.
I have many disagreements, but I'll focus on one: I think point 2 is in contradiction with points 3 and 4. To put it plainly: the "selection pressures" go away pretty quickly if we don't have reliable methods of knowing or controlling what the AI will do, or preventing it from doing noticeably bad stuff. That applies to the obvious stuff like if AI tries to prematurely go skynet, but it also applies to more mundane stuff like getting an AI to act reliably more than 99% of the time.
I believe that if we manage to control AI enough to make widespread rollout feasible, then it’s pretty likely we’ve already solved alignment well enough to prevent extinction.
Hmm right now this seems wrong to me, and also not worth going into in an introductory post. Do you have a sense that your view is commonplace? (eg from talking to many people not involved in AI)
I’m pro-nuclear, but the commonly used EA framing of “nuclear is overregulated” seems net negative more often than not. Clearer Thinking’s new nuclear episode is one of the more epistemically rigorous discussions I’ve heard in EA-adjacent spaces (and Founders Pledge has also done nuanced work).
Nuclear is worth pursuing, but we should argue for it clear-eyed.
Good question — I think it’s mostly untrue as commonly used. It implies regulation is the main bottleneck, but as the podcast lays out, there are likely much better levers for driving down cost. So it’s both misleading and counterproductive as a talking point, even if you’re broadly pro-nuclear (which I and the podcast guest are).
Out of curiosity: Where have EAs argued that “nuclear is overregulated” and, more specifically, where have EAs argued that over-regulation is the only or dominant driver of the cost problem?
It's probably true that this sometimes happens—especially when EAs outside of climate/energy point to "nuclear is overregulated" as something in line with libertarian / abundance-y priors—but I think those in EA who have done work on nuclear would not subscribe to or spread the view that regulation is the only driver of the nuclear problem.
That said, it seems clearly true—and I do think Isabelle agrees with that—that regulatory reform is a necessary component of making nuclear in the West buildable at scale again (alongside many other factors, such as sustained political will, technological progress, re-established supply chains, valuing clean firm power for its attributes, etc).
Good question. I agree: people in EA who’ve actually worked on nuclear don’t usually claim over-regulation is the only or even dominant driver of the cost/buildout problem.
What I’m reacting to is more the “hot take” version that shows up in EA-adjacent podcasts — often as an analogy when people talk about AI policy: “look at nuclear, it got over-regulated and basically died, so don’t do that to AI.” In that context it’s not argued carefully, it’s just used as a rhetorical example, and (to me) it’s a pretty lossy / misleading compression of what’s going on.
So I’m not trying to call out serious nuclear work in EA — I’m mostly sharing the Clearer Thinking episode as a good “orientation reset” because it keeps pointing back to what the binding constraints plausibly are, with regulation as one (maybe not even the main) piece of a complex situation.
Also possible I’m misremembering some of the specific instances — I haven’t kept notes — but I’ve heard the framing enough that it started to rub me the wrong way.
And I’m genuinely curious where you land on the “regulatory reform is necessary” point: do you think the key thing is removing regulation, changing it, or adding policy/market design (e.g. electricity market reform / stable revenue mechanisms / valuing clean firm power)? I’m currently leaning toward “markets/revenue model is the real lever”, but I’m not confident.
One thing I loved reading was a model of Sweden's total system cost with vs without nuclear (incl. stuff like transmission build-out). It suggested fairly similar overall cost in both worlds — but the nuclear-heavy system leaned more on established tech (fewer batteries, etc., and I don't remember if demand response was included).
My read is that the real challenge is: even if total system costs are comparable, how do you actually allocate those costs and rewards in something resembling a market so the "good" system gets built? (Unless you go much more "total state-owned super regulated" and basically nationalise the whole thing.)
>What I'm reacting to is more the "hot take" version that shows up in EA-adjacent podcasts — often as an analogy when people talk about AI policy: "look at nuclear, it got over-regulated and basically died, so don't do that to AI." In that context it's not argued carefully, it's just used as a rhetorical example, and (to me) it's a pretty lossy / misleading compression of what's going on.
I agree it’s a bit lossy and sometimes reflexive (this is what I meant with relying on libertarian priors), but I am still confused about your argument.
Because the argument you criticize is a historical one ("nuclear over-regulation killed nuclear"), which is different from "now we need many steps and there are different strategies to make nuclear more competitive again".
I think it is basically correct that over-regulation played a huge part in making nuclear uncompetitive, and I don't think that Isabelle or others who know the history of nuclear energy would disagree with that, even if it might be a bit overglossed / stylized (obviously, it is not the only thing).
Ah, now I see—thanks for clarifying. Yes historically I do not know how much each set-back to nuclear mattered. I can see that e.g. constantly changing regulation, for example during builds (which I think Isabelle actually mentioned) could cause a significant hurdle for continuing build-out. Here I would defer to other experts like you and Isabelle.
Porting this over to "we might over-regulate AI too", I am realizing it is actually unclear to me whether people who use the "nuclear is over-regulated" example mean the literal same "historical" thing could happen to AI:
- We put in place constantly changing regulation on AI
- This causes large training runs to be stopped midway
- In the end this makes AI uncompetitive with humans and the AI never really takes off
- Removing regulation is not able to kick-start the industry, as talent has left and we no longer know how to build large language models cost-effectively
Writing this I still think I stand by my point that there are much better examples in terms of regulation holding progress back (speeding up vaccine development actually being such an EA cause area, human challenge trials etc.). I can lay out the arguments for why this is so if helpful. But it is basically something like “there is probably much more path dependency in nuclear compared to AI or pharma”.
U.S. Politics should be a main focus of US EAs right now. In the past year alone, every major EA cause area has been greatly hurt or bottlenecked by Trump. $40 billion in global health and international development funds was lost when USAID shut down, which some researchers project could lead to 14 million more deaths by 2030. Trump has signed an Executive Order that aims to block states from creating their own AI regulations, and has allowed our most powerful chips to be exported to China. Trump has withdrawn funding from, and U.S. support for, international governance bodies like the United Nations and the World Health Organization, thereby removing the world’s most influential country from the collaborative efforts necessary to combat climate change and global pandemics. Most recently, the administration even changed nutritional guidelines, encouraging Americans to eat more animal protein than ever, which could drive more demand for unethically-produced animal products. In addition to all of this, Trump has continuously acted undemocratically, brazenly breaking norms and laws meant to protect us from autocracy and dictatorship. This goes to show just how determinative U.S. politics is to our successes and our failures.
I agree. Basically anyone not in a politically sensitive role (this category is broader than it might intuitively seem) should be looking to make large donations in this area now and others should be reaching out to EAs focused on US politics if they feel well equipped to run or contribute to a high leverage project.
Unfortunately there is no AMF/GiveDirectly for politics, and most things you can donate to are very poorly leveraged. Likewise it is hard to both scope a leveraged project and execute well on it. I know of one general exception at the moment which I'm happy to recommend privately.
I’m also happy to speak to anyone who intends to devote considerable money or work resources to this and pass them along to the people doing the best work here if that makes sense.
On the bright side, we might end up getting an AI pause out of this, if the Netherlands wakes up and decides that it no longer wants to help supply chips for advanced AI which could either be (a) misaligned or (b) controlled by Trump. See previous discussion, protest. I reckon this moment represents a strong opportunity for Dutch EAs concerned with AI risks. Maybe get a TV interview where you explain how ASML is supplying chips to the US, then explain AI risk, etc.
In terms of red-teaming my own suggestion, I am somewhat worried about further politicizing the issue of AI / highlighting national rivalries. Seems best to push for symmetric restrictions on China—they are directly supplying materials to Russia for its war in Ukraine, after all. Eliezer Yudkowsky could be an interesting person to contact for red-teaming purposes, since he’s strongly in favor of an AI pause, but also seems to resist any “international rivalry” framing of AI risk concerns?
I think by the nature of how the EA Forum works, any proposed solution is likely to be more controversial than a generic “someone should do something about US politics” message. So any proposed solution will get at least a few early downvotes, causing low visibility. EAs want to upvote things which feel official and authoritative. They usually seem uninterested in improvisational brainstorming in response to an evolving situation. This will cause a paradoxical result where despite the “someone should do something about US politics” talk, proposing solutions will feel like a waste of time.
Maybe it would be good to create a dedicated brainstorming thread to try and mitigate this a little bit.
More good news! The Norwegian meat industry has announced that it will stop using fast-growing chicken breeds by the end of 2027. These breeds are a source of immense suffering due to the toll such rapid growth takes on the animals' bodies.
Norway will be the first country to stop using them.
Reminder: claim tax relief on charitable donations (UK PAYE taxpayers)
If you:
Pay the higher rate of tax in the UK (earn over £50,271 or £43,663 in Scotland)
Don’t fill in a Self Assessment tax return (you pay tax automatically via PAYE)
Made donations that you claimed Gift Aid on in this or any of the previous 4 tax years
You can use this HMRC link to tell HMRC how much you’ve donated excluding Gift Aid and claim back the difference. More details in this evergreen post.
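As a rough illustration (my numbers, assuming a 40% higher-rate taxpayer outside Scotland): if you donate £100 and claim Gift Aid, the charity reclaims £25 of basic-rate tax, so the gross donation is £125. You can then claim back the difference between the higher and basic rates on that gross amount, i.e. 20% of £125 = £25. Check HMRC's guidance for your own rates and situation.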
Practical tip
I set my Giving What We Can pledge tracking to run from 6 April to 5 April, which matches the UK tax year. That makes it easy to report the correct annual figure to HMRC.
Important notes / assumptions
This does not apply if you donate through Give As You Earn (the relief is already applied).
If you’re claiming relief on £10,000 or more of donations, you must also tell HMRC:
The European Parliament recently submitted a parliamentary question on wild animal welfare! The question focuses on human-caused wild animal suffering, and such questions generally don't have policy implications—but still, I was surprised to see this topic being taken up in policy discourse.
Are there any signs of governments beginning to do serious planning for the need for Universal Basic Income (UBI) or negative income tax? It feels like there's a real lack of urgency/rigour in policy engagement within government circles. The concept has obviously had its high-level advocates à la Altman, but it still feels incredibly distant as any form of reality.
Meanwhile the impact is being seen in job markets right now—in the UK, graduate job openings have plummeted in the last 12 months. People I know with elite academic backgrounds are having a hard enough time finding jobs—let alone the vast majority of people who went to average universities. This is happening today—before there's any consensus on the arrival of AGI and widely recognised mass displacement in mid-career job markets. Impact is happening now, but preparation for major policy intervention in current fiscal scenarios seems really far off. If governments do view the risk of major employment market disruption as a realistic possibility (which I believe in many cases they do), are they planning for interventions behind the scenes? Or do they view the problem as too big to address until it arrives... viewing rapid response > careful planning in the way the COVID emergency fiscal interventions emerged.
Would be really interested to hear of any good examples of serious thinking/preparation on how some form of UBI could be planned for (logistically and fiscally) in the near-term 5-year horizon.
Hmm, I think that’s not the right framing for this. UBI is just not settled as a universally good idea in academic or political circles (sorry, no definitive citation for this), let alone that there’s an urgent unemployment crisis (the statistic I think you’re citing is for job openings, not actual employment rates) or that such a crisis, if it did exist, has structural causes which could be expected to increase (i.e. it might not be AI, nor should we necessarily expect AI to become orders of magnitude more advanced in the next 5 years; there was plausibly a very different shock to the global economic system beginning around Liberation Day, 2025).
Thanks for your take—I always appreciate slightly less doom and gloom perspectives.
On your point that there's not an imminent unemployment crisis and what impacts we are seeing may be due to other factors. Firstly, I think it's inevitable that the direct causes of disruption to the labour market are going to be multifaceted given the current trajectory of global markets (de-coupling, de-globalisation etc.), whatever happens moving forward. In the UK specifically, part of the issue is that the minimum wage has been increased, making employers less inclined to hire grads (and yes, it's grad openings which have halved, but we're simultaneously seeing 18-25 unemployment rising... not yet anywhere close to '08 levels but gradually increasing) - but this and other factors aren't independent of AI, but rather accelerate its impacts (i.e. right now AI probably can't replace all grad jobs, but CEOs are more willing to experiment, explore, and begin premature replacement because of rising grad costs and higher tax rates... but now those jobs are gone, they're off the market. The more broad market shocks there are, the more businesses will look to AI to cut costs).
Secondly, I definitely take your point that AI may not become orders of magnitude more capable in the next five years and that 'AI is coming for our jobs' could be overblown. I suppose my thinking is—it might. Even if there is, say, a 30% chance that in the next 5 years even 30-40% of white-collar jobs get replaced... that feels like a massive shock to high-income countries' way of life, and a major shift in the social contract for the generation of kids who have got into massive debt for university courses only to find substantially reduced market opportunities. That requires substantial state action.
And that's the more conservative end of things; if there's any reasonable chance that within even the next 10-15 years AI becomes capable of replacing a higher percentile of the total job market, surely we need some effective way of ensuring people with no job opportunities have some degree of resources. Even if you're correct that it's fairly unlikely—to avoid major social instability in the event of such a scenario, it feels prudent that governments would be doing serious planning for the potential (just as they would for pandemics or war-gaming, no matter how unlikely the scenario).
And finally—you may well be right that UBI is not accepted as a good idea in political circles. I’m not wedded to that particular approach, and have heard ideas about negative income tax floating around. But in a scenario where there simply isn’t sufficient available employment for a functioning labour market and allocation of basic resources to all citizens—I’d like to think that governments are putting serious thinking and planning into how we ensure society continues to function, and what the social contract might become, and not that we’re waiting until that reality comes to pass to plan for an appropriate response.
Technical Alignment Research Accelerator (TARA) applications close today!
Last chance to apply to join the 14-week, remotely taught, in-person run program (based on the ARENA curriculum) designed to accelerate APAC talent towards meaningful technical AI safety research.
TARA is built for you to learn around full-time work or study by attending meetings in your home city on Saturdays and doing independent study throughout the week. Finish the program with a project to add to your portfolio, key technical AI safety skills, and connections across APAC.
See this post for more information and apply through our website here.
Super sceptical probably very highly intractable thought that I haven’t done any research on: There seem to be a lot of reasons to think we might be living in a simulation besides just Nick Bostrom’s simulation argument, like:
All the fundamental constants and properties of the universe are perfectly suited to the emergence of sentient life. This could be explained by the Anthropic principle, or it could be explained by us living in a simulation that has been designed for us.
The Fermi Paradox: there don’t seem to be any other civilizations in the observable universe. There are many explanations for the Fermi Paradox, but one additional explanation might be that whoever is simulating the universe created it for us, or they don’t care about other civilizations, so haven’t simulated them.
We seem to be really early on in human history. Only about 60 billion people have ever lived IIRC but we expect many trillions to live in the future. This can be explained by the Doomsday argument—that in fact we are in the time in human history where most people will live because we will soon go extinct. However, this phenomenon can also be explained by us living in a simulation—see next point.
Not only are we really early, but we seem to be living at a pivotal moment in human history that is super interesting. We are about to create intelligence greater than ourselves, expand into space, or probably all die. Like if any time in history were to be simulated, I think there’s a high likelihood it would be now.
If I was pushed into a corner, I might say the probability we are living in a simulation is like 60%, where most evidence seems to point towards us being in a simulation. However, the doubt comes from the high probability that I'm just thinking about this all wrong—like, of course I can come up with a motivation for a simulation to explain any feature of the universe… it would be hard to find something that doesn't line up with an explanation that the simulators are just interested in that particular thing. But in any case, that's still a really high probability of everyone I love potentially not being sentient or even real (fingers crossed we're all in the simulation together). Also, being in a simulation would change our fundamental assumptions about the universe and life, and it would be really weird if that had no impact on moral decision-making.
But everyone I talk to seems to have a relaxed approach to it, like it’s impossible to make any progress on this and that it couldn’t possibly be decision-relevant. But really, how many people have worked on figuring it out with a longtermist or EA-mindset? Some reasons it might be decision-relevant:
We may be able to infer from the nature of the universe and the natural problems ahead of us what the simulators are looking to understand or gain from the simulation (or at least we might attach percentage likelihoods to different goals). Maybe there are good arguments to aim to please the simulators, or not. Maybe we end the simulation if there are end-conditions?
Being in a simulation bears on the probability that aliens exist (they probably have a lower probability of existing if we are in a simulation), which helps with long-term grand planning. Like, we wouldn't need to worry about integrating defenses against alien attacks or engaging in acausal trade with aliens.
We can disregard arguments like The Doomsday Argument, lowering our p(doom)
Some questions I'd ask are:
How much effort have we put into figuring out if there is something decision-relevant to do about this from a moral impact perspective? How much effort should we put into this?
How much effort has gone into figuring out if we are, in fact, in a simulation, using empiricism? What might we expect to see in a simulated universe vs a real world? How can we search for and detect that?
Overall, this does sound nuts to me and it probably shouldn't go further than this quick take, but I do feel like there could be something here, and it's probably worth a bit more attention than I think it has gotten (like 1 person doing a proper research project on it at least). Lots of other stuff sounded crazy but now has significant work and (arguably) great progress, like trying to help people billions of years in the future, working on problems associated with digital sentience, and addressing wild animal welfare. There could be something here and I'd be interested in hearing thoughts (especially a good counterargument to working on this so I don't have to think about it anymore) or learning about past efforts.
All the things you mentioned aren't uniquely evidence for the simulation hypothesis but are equally evidence for a number of other hypotheses, such as the existence of a supernatural, personal God who designed and created the universe. (There are endless variations on this hypothesis, and we could come up with endlessly more.)
The fine-tuning argument is a common argument for the existence of a supernatural, personal God. The appearance of fine-tuning supports this conclusion equally as well as it supports the simulation hypothesis.
Some young Earth creationists believe that dinosaur fossils and other evidence of an old Earth were intentionally put there by God to test people’s faith. You might also think that God tests our faith in other ways, or plays tricks, or gets easily bored, and creates the appearance of a long history or a distant future that isn’t really there. (I also think it’s just not true that this is the most interesting point in history.)
Similarly, the book of Genesis says that God created humans in his image. Maybe he didn’t create aliens with high-tech civilizations because he’s only interested in beings with high technology made in his image.
It might not be God who is doing this, but in fact an evil demon, as Descartes famously discussed in his Meditations around 400 years ago. Or it could be some kind of trickster deity like Loki who is neither fully good nor fully evil. There are endless ideas that would slot in equally well to replace the simulation hypothesis.
You might think the simulation hypothesis is preferable because it's a naturalistic hypothesis and these are supernatural hypotheses. But this is wrong: the simulation hypothesis is a supernatural hypothesis. If there are simulators, the reality they live in is stipulated to have different fundamental laws of nature, such as the laws of physics, than exist in what we perceive to be the universe. For example, in the simulators' reality, maybe the fundamental relationship between consciousness and physical phenomena such as matter, energy, space, time, and physical forces is such that consciousness can directly, automatically shape physical phenomena to its will. If we observed this happening in our universe, we would describe this as magic or a miracle.
Whether you call them “simulators” or “God” or an “evil demon” or “Loki”, and whether you call it a “simulation” or an “illusion” or a “dream”, these are just different surface-level labels for substantially the same idea. If you stipulate laws of nature radically other than the ones we believe we have, what you’re talking about is supernatural.
If you try to assume that the physics and other laws of nature in the simulators' reality are the same as in our perceived reality, then the simulation argument runs into a logical self-contradiction, as pointed out by the physicist Sean Carroll. Endlessly nested levels of simulation mean computation in the original simulators' reality will run out. Simulations at the bottom of the nested hierarchy, which don't have enough computation to run still more simulations inside them, will outnumber higher-level simulations. The simulation argument says, as one of its key premises, that in our perceived reality we will be able to create simulations of worlds or universes filled with many digital minds; but the simulation hypothesis implies this is actually impossible, so the simulation argument's conclusion contradicts one of its premises.
There are other strong reasons to reject the simulation argument. Remember that a key premise is that we ourselves or our descendants will want to make simulations. Really? They’ll want to simulate the Holocaust, malaria, tsunamis, cancer, cluster headaches, car crashes, sudden infant death syndrome, and Guantanamo Bay? Why? On our ethical views today, we would not see this as permissible, but rather the most grievous evil. Why would our descendants feel differently?
Less strongly, computation is abundant in the universe but still finite. Why spend computation on creating digital minds inside simulations when there is always a trade-off between doing that and creating digital minds in our universe, i.e. the real world? If we or our descendants think marginally and hold as one of our highest goals to maximize the number of future lives with a good quality of life, using huge amounts of computation on simulations might be seen as going against that goal. Plus, there are endlessly more things we could do with our finite resource of computation, most of which we can't imagine today. Where would creating simulations fall on the list?
You can argue that creating simulations would be a small fraction of overall resources. I'm not sure that's actually true; I haven't done the math. But just because something is a small fraction of overall resources doesn't mean it will likely be done. In an interstellar, transhumanist scenario, our descendants could create a diamond statue of Hatsune Miku the size of the solar system and this would take a tiny percentage of overall resources, but that doesn't mean it will likely happen. The simulation argument specifically claims that making simulations of early 21st century Earth will interest our descendants more than alternative uses of resources. Why? Maybe they'll be more interested in a million other things.
Overall, the simulation hypothesis is undisprovable but no more credible than an unlimited number of other undisprovable hypotheses. If something seems nuts, it probably is. Initially, you might not be able to point out the specific logical reasons it’s nuts. But that’s to be expected — the sort of paradoxes and thought experiments that get a lot of attention (that “go viral”, so to speak) are the ones that are hard to immediately counterargue.
Philosophy is replete with oddball ideas that are hard to convincingly refute at first blush. The Chinese Room is a prime example. Another random example is the argument that utilitarianism is compatible with slavery. With enough time and attention, refutations may come. I don’t think one’s inability to immediately articulate the logical counterargument is a sign that an oddball idea is correct. It’s just that thinking takes time and, usually, by the time an oddball idea reaches your desk, it’s proven to be resistant to immediate refutation. So, trust that intuition that something is nuts.
Strong upvoted as that was possibly the most compelling rebuttal to the simulation argument I’ve seen in quite a while, which was refreshing for my peace of mind.
That being said, it mainly targets the idea of a large-scale simulation of our entire world. What about the possibility that the simulation is for a single entity and that the rest of the world is simulated at a lower fidelity? I had the thought that a way to potentially maximize future lives of good quality would be to contain each conscious life in a separate simulation where they live reasonably good lives catered to their preferences, with the apparent rest of the world being virtual. Granted, I doubt this conjecture because in my own opinion my life doesn't seem that great, but it seems plausible at least?
Also, that line about the diamond statue of Hatsune Miku was very, very amusing to this former otaku.
Changing the simulation hypothesis from a simulation of a world full of people to a simulation of an individual throws the simulation argument out the window. Here is how Sean Carroll articulates the first three steps of the simulation argument:
We can easily imagine creating many simulated civilizations.
Things that are that easy to imagine are likely to happen, at least somewhere in the universe.
Therefore, there are probably many civilizations being simulated within the lifetime of our universe. Enough that there are many more simulated people than people like us.
The simulation argument doesn’t apply to you, as an individual. Unless you think that you, personally, are going to create a simulation of a world or an individual — which obviously you’re not.
Changing the simulation hypothesis from a world-scale simulation to an individual-scale simulation also doesn’t change the other arguments against the simulation hypothesis:
The bottoming out argument. This is the one from Sean Carroll. Even if we supposed you, personally, were going to create individual-scale simulations in the future, eventually a nesting cascade of such simulations would exhaust available computation in the top-level universe, i.e. the real universe. The bottom-level simulations within which no further simulations are possible would outnumber higher-level ones. The conclusion of the simulation argument contradicts a necessary premise.[1]
The ethical argument. It would be extremely unethical to imprison an individual in a simulation without their consent, especially a simulation with a significant amount of pain and suffering that the simulators are programming in. Would you create an individual-scale simulation even of an unrealistically pleasant life, let alone a life with significant pain and suffering? If we had the technology to do this today, I think it would be illegal. It would be analogous to false imprisonment, kidnapping, torture, or criminal child abuse (since you are creating this person).
The computational waste argument. Making an individual-scale simulation would require at least as much computation as creating a digital mind in the real universe. In fact, it would require more computation, since you also have to simulate the whole world around the individual, not just the individual themselves. If the simulators think marginally, they would prefer to use these resources to create a digital mind in the real universe or put them to some other, better use.
If the point of the simulation is to cater it to the individual’s preferences, we should ask:
a) Why isn’t this actually happening? Why is there so much unnecessary pain and suffering and unpleasantness in every individual’s life? Why simulate the covid-19 pandemic?
b) Why not cater to the individual’s fundamental and overriding preference not to be in a simulation?
c) Why not put these resources toward any number of superior uses that must surely exist?[2]
Perhaps most importantly, changing the simulation hypothesis from world-scale to individual-scale doesn’t change perhaps the most powerful counterargument to the simulation hypothesis:
The unlimited arbitrary, undisprovable hypotheses argument. There is no reason to think the simulation hypothesis makes any more sense or is any more likely to be true than the hypothesis that the world you perceive is an illusion created by an evil demon or a trickster deity like Loki. There are an unlimited number of equally arbitrary and equally unjustified hypotheses of this type that could be generated. In my previous comment, I argued that versions of the simulation hypotheses in which the laws of physics or laws of nature are radically different in the real universe than in the simulation are supernatural hypotheses. Versions of the simulation hypothesis that assume real universe physics is the same as simulation physics suffer from the bottoming out argument and the computational waste argument. So, either way, the simulation hypothesis should be rejected. (Also, whether the simulation has real universe physics or not, the ethical argument applies — another reason to reject it.)
This argument also calls into question why we should think simulation physics is the same as real universe physics, i.e. why we should think the simulation hypothesis makes more sense as a naturalistic hypothesis than a supernatural hypothesis. The simulation hypothesis leans a lot on the idea that humans or post-humans in our hypothetical future will want to create “ancestor simulations”, i.e. realistic simulations of the simulators’ past, which is our present. If there were simulations, why would ancestor simulations be the most common type? Fantasy novels are about equally popular as historical fiction or non-fiction books about history. Would simulations skew toward historical realism significantly more than books currently do? Why not simulate worlds with magic or other supernatural phenomena? (Maybe we should conclude that, since this is more interesting, ghosts probably exist in our simulation. Maybe God is simulated too?) The “ancestor simulation” idea is doing a lot of heavy lifting; it’s not clear that this is in any way a justifiable assumption rather than an arbitrary one. The more I dig into the reasoning behind the simulation hypothesis, the more it feels like Calvinball.[3]
The individual-scale simulation hypothesis also introduces new problems that are unique to it:
Simulation of other minds. If you wanted to build a robot that could perfectly simulate the humans you know best, the underlying software would need to be a digital mind. Since, on the individual-scale simulation hypothesis, you are a digital mind, then the other minds in the simulation — at least the ones you know well — are as real as you are. You could try to argue that these other minds only need to be partially simulated. For example, the mind simulations don’t need to be running when you aren’t observing or interacting with these people. But then why don’t these people report memory gaps? If the answer is that the simulation fills in the gaps with false memories, what process continually generates new false memories? Why would this process be less computationally expensive than just running the simulation normally? (You could also try to say that consciousness is some kind of switch that can be flipped on or off for some simulations but not others. But I can’t think of any theory of consciousness this would be compatible with, and it’s a problem for the individual-scale simulation hypothesis if it just starts making stuff up ad hoc to fit the hypothesis.)
If we decide that at least the people you know well must be fully simulated, in the same way you are, then what about the people they know well? What about the people who they know well know well? If everyone in the world is connected through six degrees of separation or fewer, then it seems like individual-scale simulations are actually impossible and all simulations must be world-scale simulations.
Abandoning the simulation of history at large scale. Individual-scale simulations don’t provide the same informational value that world-scale simulations might. When people talk about why “ancestor simulations” would supposedly be valuable or desired, they usually appeal to the notion of simulating historical events on a large scale. This obviously wouldn’t apply to individual-scale simulations. To the extent credence toward the simulation hypothesis depends on this, an individual-scale simulation hypothesis may be even less credible than a world-scale simulation hypothesis.
The Wikipedia page on the simulation hypothesis notes that it’s a contemporary twist on a centuries-old if not millennia-old idea. We’ve replaced dreams and evil demons with computers, but the underlying idea is largely the same. The reasons to reject it are largely the same, although the simulation argument has some unique weaknesses. That page is a good resource for finding still more arguments against the simulation hypothesis.[4]
Carroll, who is a physicist and cosmologist, also criticizes the anthropic reasoning of the simulation argument. I recommend reading his post, it’s short and well-written.
You could try to argue that, despite society's best efforts, it will be impossible to prevent a large number of simulations from being created. Pursuing this line of argument requires speculating about the specific details of a distant, transhuman or post-human future. Would an individual creating a simulation be more like an individual today operating a meth lab or launching a nuclear ICBM? I'm not sure we can know the answer to this question. If dangerous or banned technologies can't be controlled, what does this say about existential risk? Will far future, post-human terrorists be able to deploy doomsday devices? If so, that would undermine the simulation argument. (Will post-humans even have the desire to be terrorists, or is that a defect of humanity?)
Related to this are various arguments that the simulation argument is self-defeating. We infer things about the real universe from our perceived universe. We then conclude that our perceived universe is a simulation. But, if it is, this undermines our ability to infer anything about the real universe from our perceived universe. In fact, this undermines the inference that our perceived universe is a simulation within a real universe. So, the simulation argument defeats itself.
In addition to all the above, I would be curious to hear empirical, scientific arguments about the amount of computation that might be required for world-scale simulations, which would be partly applicable to individual-scale simulations. Obviously, our universe can’t run a full-scale, one-to-one simulation of our universe with perfect fidelity — that would require more computation, matter, and energy than our universe has. If you only simulate the solar system with perfect fidelity, you can pare that down a lot. You can make other assumptions to pare down the computation required. It’s much less important than all the arguments and considerations described above, but if we get a better understanding of approximately how difficult or costly a world-scale simulation might be, that could help put some considerations like computational waste in perspective.
I would not describe the fine-tuning argument and the Fermi paradox as strong evidence in favour of the simulation hypothesis. I would instead say that they are open questions for which a lot of different explanations have been proposed, with the simulation hypothesis offering only one of many possible resolutions.
As to the “importance” argument, we shouldn’t count speculative future events as evidence of the importance of now. I would say the mid-20th century was more important than today, because that’s the closest we ever got to nuclear annihilation (plus like, WW2).
I’ve thought about this a lot too. My general response is that it is very hard to see what one could do differently at a moment to moment level even if we were in a simulation. While it’s possible that you or I are alone in the simulation, we can’t, realistically, know this. We can’t know with much certainty that the apparently sentient beings who share our world aren’t actually sentient. And so, even if they are part of the simulation, we still have a moral duty to treat them well, on the chance they are capable of subjective experiences and can suffer or feel happiness (assuming you’re a Utilitarian), or have rights/autonomy to be respected, etc.
We also have no idea who the simulators are and what purpose they have for the simulation. For all we know, we are a petri dish for some aliens, or a sitcom for our descendants, or a way for people's minds on colony ships travelling to distant galaxies to spend their time while in physical stasis. Odds are, if the simulators are real, they'll just make us forget about whatever if we finally figure it out, so they can continue it for whatever reasons.
Given all this, I don’t see the point in trying to defy them or doing really anything differently than what you’d do if this was the ground truth reality. Trying to do something like attempting to escape the simulation would most likely fail AND risk getting you needlessly hurt in this world in the process.
If we’re alone in the sim, then it doesn’t matter what we do anyway, so I focus on the possibility that we aren’t alone, and everything we do does, in fact, matter. Give it the benefit of the doubt.
At least, that’s the way I see things right now. Your mileage may vary.
I made this simple high-level diagram of critical longtermist “root factors”, “ultimate scenarios”, and “ultimate outcomes”, focusing on the impact of AI during the TAI transition.
This involved some adjustments to standard longtermist language:
"Accident Risk" → "AI Takeover"
"Misuse Risk" → "Human-Caused Catastrophe"
"Systemic Risk" → This is split up into a few modules, focusing on "Long-term Lock-in", which I assume is the main threat.
You can read and interact with it here, where there are (AI-generated) descriptions and pages for things.
Curious to get any feedback!
I’d love it if there could eventually be one or a few well-accepted and high-quality assortments like this. Right now some of the common longtermist concepts seem fairly unorganized and messy to me.
---
Reservations:
This is an early draft. There are definitely parts I find inelegant. I've played with the final nodes instead being things like "Pre-transition Catastrophe Risk" and "Post-Transition Expected Value", for instance. I didn't include a node for "Pre-transition value"; I think this can be added on, but would involve some complexity that didn't seem worth it at this stage. The lines between nodes were mostly generated by Claude and could use more work.
This also heavily caters to the preferences and biases of the longtermist community, specifically some of the AI safety crowd.
Just finding out about this & the crux website. So cool. Would love to see something like this for charity ranking (if it isn't already somewhere on the site).
Don't you need a philosophy axioms layer between outputs and outcomes? Existential catastrophe definitions seem to be assuming a lot of things.
Would also need to think harder about why/in what context I'm using this, but "governance" being a subcomponent when it's arguably more important / can control literally everything else at the top level seems wrong.
>Would love to see something like this for charity ranking (if it isn't already somewhere on the site).
I could definitely see this being done in the future.
>Don't you need a philosophy axioms layer between outputs and outcomes?
I'm nervous that this can get overwhelming quickly. I like the idea of starting with things that are clearly decision-relevant to the particular audience the website has, then expanding from there. Am open to ideas on better / more scalable approaches!
>"governance" being a subcomponent when it's arguably more important/ can control literally everything else at the top level seems wrong.
Thanks! I'll keep that in mind. I'd flag that this is an extremely high-level diagram, meant more to be broad and elegant than to flag which nodes are most important. Many critical things are "just subcomponents". I'd like to make further diagrams on many of the different smaller nodes.
According to someone I chatted to at a party (not normally the optimal way to identify top new cause areas!) fungi might be a worrying new source of pandemics because of climate change.
Apparently this is because thermal barriers prevented fungi from infecting humans, but because fungi are adapting to higher temperatures, they are now better able to overcome those barriers. This article has a bit more on this:
Purportedly, this is even more scary than a pathogen you can catch from people, because you can catch this from the soil.
I suspect that if this were, in fact, the case, I would have heard about it sooner. Interested to hear comments from people who know more about it than me, or have more capacity than me to read up about it a bit.
When people ask me “What is one area or issue you wish people paid more attention to in global health?”, I almost always say fungal diseases.
I co-authored some reports on fungal infections (e.g., this one), and my impression is that it is indeed very plausible and well-recognized by experts that fungal infections will rise in a major way as a result of climate change, though I have not seen any guesses / estimates of how large the additional burden could be.
I think the more important point is that, regardless of climate change, fungal diseases are a massive disease burden source already. Fungal disease-related deaths are plausibly on the order of ~2M/year, likely more, and it is possible that DALYs are in a similar ballpark as TB, malaria, and HIV (though again unclear, because fungal diseases aren’t even comprehensively included in IHME’s global burden of disease estimates yet).
It is also incredibly neglected, to an extent that I find almost unbelievable. Though this has recently improved a bit, with more attention / funding from the Wellcome Trust coming in.
I think one reason that people aren't jumping on fungal diseases despite high importance and neglectedness is that tractability is tricky. Fungal disease treatments are often not very effective, are expensive and difficult to administer, and have lots of side effects. Also, there are LOTS of different fungal diseases that all affect different populations, manifest differently, and require different diagnostics/treatment. So there isn't really an easy one-size-fits-all solution here.
I do not find it surprising that you haven’t heard about it. Lots of people I know haven’t, and there are several reasons for this that are too long to explain here (though this article might help).
Maybe helpful for you to know that Coefficient Giving have done internal research on fungal diseases (they also commissioned our work on this topic), so they might have more thoughts on this.
Hi Jenny, very interesting, thank you. What was the response of CG to your report, and do you know if they are planning to invest more resources towards this potential cause area?
I’m not able to comment on CG’s reaction to the report, as those discussions are confidential.
What I can say is that they are still exploring this area internally (given that they commissioned us to do more work related to fungal diseases recently (see here)).
I’m not aware of any specific grantmaking decisions or commitments at this stage.
Thanks this is super interesting and definitely concerning.
FWIW within the non-EA Global Health Community this has been a topic of conversation for the last 3-4 years. It is a potential threat, but still seems like a super low percentage Xish-risk, because...
a) We haven't actually seen anything terribly dangerous happen yet.
b) Antifungal medications are there, and if there was a super-dangerous mass fungal threat I suspect we could make better ones pretty quicksmart. But yes, this is far from guaranteed.
As a side note there are already plenty of pathogens we catch from the soil like anthrax and tetanus, as well as worms like hookworm!
The person I spoke to at the party said that he knew somebody who had a fungal infection and was likely to die from it.
I don’t know much about antifungals, but I infer from his comment that we don’t have enough antifungals to cover all of the potential fungal infections.
To my knowledge, there are a few (not actually that many) existing antifungals, but as I commented above, they mostly aren’t very good, and in several deadly fungal infections they are almost pointless.
Also, when a new fungal pathogen emerges, it might be harmless, or it might be big trouble; nobody can predict that. A good example I've seen mentioned a few times is Candida auris (a pretty serious and often deadly fungal infection) that emerged in 2009 independently in several regions of the world, pretty much out of nowhere. And the scary thing is that it was drug-resistant from the start! I think researchers aren't quite sure why it emerged, but it could be related to climate change.
The idea of fungi evolving to infect humans and resulting in an apocalypse underpins the premise of the famous game and TV series "The Last of Us".
Given the series’ critical acclaim and popularity, I wonder if it also demonstrates potential for engaging the public with this topic through mainstream popular media.
I was wondering if anyone was going to mention that. There was a lot of media buzz about whether the events of the show could really happen at the time of its airing. This piece by Yale is supposed to sound reassuring, but it just… doesn’t. :/
Among other things, the natural-atrocity take on zombies is what made me fall in love with the TV series: depressing, but fascinating as a new aesthetic of dangerous nature globally ending human civilization (think overgrown moss on broken subways). I can indeed see visions of such a world motivating EA people to work on preventing it. 🪸
This idea gets discussed in infectious disease circles, but it is often framed more dramatically than the evidence supports. Fungi adapting to higher temperatures is real, Candida auris is a good example, but most fungi still struggle to survive in the human body and spread efficiently between people. Soil exposure already exists today, yet serious fungal infections remain rare and mostly affect immunocompromised individuals. It is a risk worth monitoring, not a hidden pandemic waiting to explode, which is likely why it has not triggered broader alarms outside specialist research.
What are people’s favorite arguments/articles/essays trying to lay out the simplest possible case for AI risk/danger?
Every single argument for AI danger/risk/safety I’ve seen seems to overcomplicate things. Either they have too many extraneous details, or they appeal to overly complex analogies, or they seem to spend much of their time responding to insider debates.
I might want to try my hand at writing the simplest possible argument that is still rigorous and clear, without being trapped by common pitfalls. To do that, I want to quickly survey the field so I can learn from the best existing work as well as avoid the mistakes they make.
Max Tegmark explains it best I think. Very clear and compelling and you don’t need any technical background to understand what he’s saying.
I believe it was his third, or maybe second, appearance on Lex Fridman’s podcast where I first heard his strongest arguments, although those episodes are quite long with extraneous content; here is a version that is just the arguments. His solutions are somewhat specific, but overall his explanation is very good I think:
“Postrel does describe five characteristics of ‘dynamist rules’:
As an overview, dynamist rules:
Allow individuals (including groups of individuals) to act on their own knowledge.
Apply to simple, generic units and allow them to combine in many different ways.
Permit credible, understandable, enduring, and enforceable commitments.
Protect criticism, competition, and feedback.
Establish a framework within which people can create nested, competing frameworks of more specific rules.
I see some overlap with existing ideas in AI policy:
Transparency, everyone’s favorite consensus recommendation, fits well into a dynamist worldview. It helps with Postrel’s #1 (giving individuals access to better information that they can act on as they choose), #3 (facilitating commitments), and #4 (facilitating criticism and feedback). Ditto whistleblower protections.
Supporting the development of a third-party audit ecosystem also fits—it helps create and enforce credible commitments, per #3, and could be considered a kind of nestable framework, per #5.
The value of open models in driving decentralized use, testing, and research is obvious through a dynamist lens, and jibes with #1 and #4. (I do think there should be some precautionary friction before releasing frontier models openly, but that’s a narrow exception to the broader value of open source AI resources.)
Another good bet is differential technological development, aka defensive accelerationism—proactively building technologies that help manage challenges posed by other technologies—though I can’t easily map it onto Postrel’s five characteristics. I’d be glad to hear readers’ ideas for other productive directions to push in.”
At the NIH, Jay Bhattacharya did a lot to reduce animal experimentation and thus reduce animal suffering. As far as ChatGPT can tell, this seems to be completely ignored by the Effective Altruism forum.
Marty Makary’s FDA is also taking steps to reduce the need for animal testing in FDA approvals.
Is this simply because Effective Altruists don’t like the Trump administration, so they can’t take the win of MAHA bringing contrarians into control of health policy who do things like care more about reducing animal suffering and fighting the replication crisis?
EAs concerned about animal welfare have typically focused on farmed animals, as opposed to animal testing, because of the much larger scale of the suffering.
EAs mostly haven’t heard of it.
Maybe some EAs have heard about it, but they don’t think it is worth the effort to write a post about it.
But tribalistic explanations could be a factor too (e.g. MAHA has anti-science vibes, and EAs like to stay on the pro-science side).
(This is probably not the most constructive feedback, but my initial reaction to this short form was that it felt like a right-wing analog of left-wing “Why don’t the EAs tweet about Gaza?”-style criticisms).
Dwarkesh (of the famed podcast) recently posted a call for new guest scouts. Given how influential his podcast is likely to be in shaping discourse around transformative AI (among other important things), this seems worth flagging and applying for (at least, for students or early career researchers in bio, AI, history, econ, math, or physics who have a few extra hours a week).
The role is remote, pays ~$100/hour, and expects ~5–10 hours/week. He’s looking for people who are deeply plugged into a field (e.g. grad students, postdocs, or practitioners) with high taste. Beyond scouting guests, the role also involves helping assemble curricula so he can rapidly get up to speed before interviews.
This is a solid opportunity for people who already live inside a domain and enjoy synthesis more than spotlight. The pay reflects the expectation of taste and context, not just surface level research. Helping shape guest selection and prep indirectly shapes the conversation, which matters given the reach of the podcast. For the right grad student or practitioner, this is leverage and learning at the same time.
I’d be keen for great people to apply to the Deputy Director role ($180-210k/y, remote) at the Mirror Biology Dialogues Fund. I spoke a bit about mirror bacteria on the 80k podcast, James Smith also had a recent episode on it. I generally think this is among the most important roles in the biosecurity space and I’ve been working with the MBDF team for a while now and am impressed by what they’re getting done.
People might be surprised to hear that I put ballpark 1% p(doom) on mirror bacteria alone at the start of 2024. That risk has been cut substantially by the scientific consensus that has formed against building it since then, but there is some remaining risk that the boundaries are not drawn far enough from the brink that bad actors could access it. Having a great person in this role would help ensure a wider safety margin.
This role sounds important precisely because the risk is no longer theoretical but also not fully contained. Cutting risk through consensus helps, but it does not replace strong governance and clear red lines. A Deputy Director who understands both the technical details and the incentives of bad actors can close gaps that policy statements cannot. If mirror bacteria still sit close enough to misuse, staffing quality becomes a real safety control, not just an admin decision.
I notice the ‘guiding principles’ in the introductory essay on effectivealtruism.org have been changed. It used to list: prioritisation, impartial altruism, open truthseeking, and a collaborative spirit. It now lists: scope sensitivity, impartiality, scout mindset, and recognition of trade-offs.
As far as I’m aware, this change wasn’t signalled. I understand lots of work has been recently done to improve the messaging on effectivealtruism.org—which is great! -- but it feels a bit weird for ‘guiding principles’ to have been changed without any discussion or notice.
As far as I understand, back in 2017 a set of principles were chosen through a somewhat deliberative process, and then organisations were invited to endorse them. This feels like a more appropriate process for such a change.
I can’t speak for the choice of principles themselves, but can give some context on why the change was made in the intro essay (and clarify a mistake I made).
There are different versions of EA principles online. One version was CEA’s guiding principles you mention from 2017, and had endorsement from some other organisations. CEA added a new intro essay to effectivealtruism.org in 2022, with a different variation of a list of principles and Ben Todd as a main author: you can read the Forum post announcing the new essay here, and see the archived version here.
After Zach’s post outlining the set of principles that are core to CEA’s principles-first approach (that had existed for some time and been published on the CEA website, but not on effectivealtruism.org), we updated them in the intro essay for consistency. I also find Zach’s footnote to be helpful context:
“This list of principles isn’t totally exhaustive. For example, CEA’s website lists a number of “other principles and tools” below these core four principles and “What is Effective Altruism?” lists principles like “collaborative spirit”, but many of them seem to be ancillary or downstream of the core principles. There are also other principles like integrity that seem both true and extremely important to me, but also seem to be less unique to EA compared to the four core principles (e.g. I think many other communities would also embrace integrity as a principle).”
I also want to say thanks to you (and @Kestrel🔸) for pointing out that collaborative spirit is no longer mentioned, that was actually a mistake! When we updated the principles in the essay we still wanted to reference collaborative spirit, but I left that paragraph out by mistake. I’ve now added it:
“It’s often possible to achieve more by working together, and doing this effectively requires high standards of honesty, integrity, and compassion. Effective altruism does not mean supporting ‘ends justify the means’ reasoning, but rather is about being a good citizen, while ambitiously working toward a better world.”
Last week I had a discussion about the core principles with someone at our EA office in Amsterdam. She also liked “collaborative spirit”. I remembered this discussion and decided to check it again and see that you decided to add this in the intro essay. That’s great! Shouldn’t it then also be added on the “core principles” page? (Or am I overlooking something?)
I think that infighting is a major reason why EA and many similar movements achieve far less than they could. I really like when EA is a place where people with very different beliefs who prioritise very different projects can collaborate productively, and I think it’s a major reason for its success. It seems more unique/specific than acknowledging tradeoffs, more important to have explicitly written as a core value to prevent the community from drifting away from it, and a great value proposition.
Like James, I also found it weird that what had become a canonical definition of EA was changed without a heads-up to its community.
In any case, thank you so much for all your work, and I’m grateful that thanks to you it survives as a paragraph in the essay.
It’s really important to me, as I can sometimes find that the (non-EA) charity and government world is a bunch of status-based competition over funding pots that encourages flattery and truth distortions and bitterness.
And, ok, EA can be like that as well, but ideally it isn’t—ideally we’d be totally happy for our pet project to get cancelled and the money reallocated to doing a similar thing more efficiently. And also to support the people this happens to, recognising their inherent worth as community members and collaborators.
I am glad to see the term “truthseeking” go. The problems with this term: 1) it has never been clearly defined by anyone anywhere, 2) people seem to disagree about what it means, and 3) the main way it seems to be used in practice on the EA Forum is as an accusation made against someone else — but due to (1) and (2), it’s typically not clear what, exactly, the accusation is. “Scout mindset” is much more clearly defined, so it’s a good replacement. (I don’t particularly love that term, personally, but that’s neither here nor there.)
Scope sensitivity seems like a good replacement for prioritization, no? I guess scope sensitivity and recognition of trade-offs together have replaced prioritization. That seems fine to me. What do you think?
Impartial altruism and impartiality sound like the same thing. So, that’s fine.
I think Kestrel is right that the only clear substantive change is collaborative spirit was dropped. Is that a good guiding principle? Could it also be substituted with something a bit clearer or better?
I don’t have a super strong view on which set of guiding principles is better—I just thought it was odd for them to be changed in this way.
If pushed, I prefer the old set, and a significant part of that preference stems from the amount of jargon in the new set. My ideal would perhaps be a combination of the old set and the 2017 set.
Expanding our moral circle
We work to overcome our natural tendency to care most about those closest to us. This means taking seriously the interests of distant strangers, future generations, and nonhuman animals—anyone whose wellbeing we can affect through our choices. We continuously question the boundaries we place around moral consideration, and we’re willing to help wherever we can do the most good, not just where helping feels most natural or comfortable.
Prioritisation
We do the hard work of choosing where to focus our limited time, money, and attention. This means being willing to say “this is good, but not the best use of marginal resources”—and actually following through, even when it means disappointing people or turning down appealing opportunities. We resist scope creep and don’t let personal preferences override our considered judgments about where we can have the most impact.
Scientific mindset
We treat our beliefs as hypotheses to be tested rather than conclusions to be defended. This means actively seeking disconfirming evidence, updating based on data, and maintaining genuine uncertainty about what we don’t yet know. We acknowledge the limits of our evidence, don’t oversell our findings, and follow arguments wherever they lead—even when the conclusions are uncomfortable or threaten projects we care about.
Openness
We take unusual ideas seriously and are willing to consider approaches that seem weird or unconventional if the reasoning is sound. We default to transparency about our reasoning, funding, mistakes, and internal debates. We make our work easy to scrutinise and critique, remain accessible to people from different backgrounds, and share knowledge rather than hoarding it. We normalise admitting when we get things wrong and create cultures where people can acknowledge mistakes without fear, while still maintaining accountability.
Acting with integrity
We align our behaviour with our stated values. This means being honest even when it’s costly, keeping our commitments, and treating people ethically regardless of their status or usefulness to our goals. How we conduct ourselves—especially toward those with less power—reflects our actual values more than our stated principles. We hold ourselves and our institutions to high standards of personal and professional conduct, recognising that being trustworthy is foundational to everything else.
...where’d the collaborative spirit go? The rest is mostly relabeling, so I’d let it slide, but that does seem like a glaring omission. Did EAs helping each other not poll well in a non-EA focus group or something?
Someone should write a good, linkable online resource describing the concept of the long reflection. It’s very strange that there isn’t a simple post/webpage that I can link to that gives a good, medium-depth description.
Currently the best things are probably the EA Forum Topic page, and this list of quotes.
Hi Peter, good to meet you! If you are interested in the long reflection you might be interested in my research which I will link here which is on the broader class of interventions that the long reflection belongs, I really appreciate any feedback or comments on it.
Additionally, if this is something you’re interested in, you might be interested in this as a future forum debate topic. I raised it as a potential candidate here, I’m really hoping it gets enough initial upvotes to be a finalist candidate as I really think it’s an important crux for whether or not we achieve a highly valuable future!
There’s now also the related concept of viatopia, which is maybe a better concept/term. Not sure what the very best links on that are but this one seems a good starting point.
I’ve seen AI-based animal communication technologies starting to be involved in some EA events / discussions (e.g. https://www.earthspecies.org/ ). I’m worried these initiatives may be actively negative, and I’m wondering if anyone has / will articulate a stronger defense of why they’re good?
The high-level argument I’ve heard is that communicating with animals will make humans be more empathetic towards them. But I don’t see why this would be the most likely outcome:
Humans are already fairly empathetic to animals, especially around things that we’d consider important welfare issues. We don’t need a hen to articulately describe why she’d prefer not to have her beak cut off or be kept in a cage, I think it would be fairly obvious to most people.
Animals might become less sympathetic if we knew what they were saying. It seems possible that most of their thoughts and words are about food, sex, and ingroup / outgroup dynamics.
A similar argument is that communication would allow us to see that animals are actually intelligent, but again I don’t see why this is necessarily the case. If their thoughts are things people would generally consider crude, it’s possible people would become more confident in their lack of intelligence (despite still deserving moral consideration).
More importantly, a large effect of being able to communicate with animals is that they’ll become more useful to humans. If animals had political power or legal rights, this might open the door to mutually beneficial trade. But in reality, they don’t have these things, so it seems more likely that this would allow humans to exploit these species more easily. They reason chickens, cows, and pigs are in such a bad state is because they’re very useful to humans, and I’m worried animal communication technologies will subject more species to similar fates.
Here’s my current four-point argument for AI risk/danger from misaligned AIs.
We are on the path of creating intelligences capable of being better than humans at almost all economically and militarily relevant tasks.
There are strong selection pressures and trends to make these intelligences into goal-seeking minds acting in the real world, rather than disembodied high-IQ pattern-matchers.
Unlike traditional software, we have little ability to know or control what these goal-seeking minds will do, only directional input.
Minds much better than humans at seeking their goals, with goals different enough from our own, may end us all, either as a preventative measure or side effect.
Request for feedback: I’m curious whether there are points that people think I’m critically missing, and/or ways that these arguments would not be convincing to “normal people.” Original goal.
I think that your list is really great! As a person who tries to understand misaligned AI better, these are my arguments:
The difference between a human and an AGI might be greater than the difference between a human and a mushroom.
If the difference is that great, the AI will probably not see much difference between a cow and a human. The way humans treat other animals, the planet, and each other makes it hard to see how we could possibly create an aligned AI that is willing to save creatures like us.
If AGI has self-preservation, we are the only creatures that can threaten its existence. Which means it might want to make sure that we no longer exist, just to be safe.
AGI is a thing that we know nothing about. If it came in a spaceship with aliens, we would probably use enormous resources to make sure it would not threaten our planet. But now we are creating this alien creature ourselves and don’t do very much to make sure it isn’t a threat to our planet.
I hope my list helps!
Two thoughts here, just thinking about persuasiveness. I’m not quite sure what you mean by normal people, and also whether you still want your arguments to be actual arguments or just persuasion-maxed.
Show, don’t tell for points 1-3
For anyone who hasn’t intimately used frontier models but is willing to with an open mind, I’d guess you should just push them to use the models and actually engage mentally with them and their thought traces; even better if you can convince them to use something agentic like CC.
Ask and/or tell stories for point 4
What can history tell us about what happens when a significantly more tech-savvy/powerful nation encounters another one?
There’s no “right” answer here, though the general arc of history is that significantly more powerful nations capture/kill/etc.
What would it be like to be a native during various European conquests in the New World (especially ignoring the effects of smallpox/disease to the extent you can)?
Incan perspective? Mayan?
I especially like Orellana’s first expedition down the Amazon. As far as I can tell, Orellana was not especially bloodthirsty and had some interest in/respect for the natives. Though he was certainly misaligned with the natives.
Even if Orellana is “less bloodthirsty,” you still don’t want to be a native on that river. You hear fragmented rumors—trade, disease, violence—with no shared narrative; you don’t know what these outsiders want or what their weapons do; you don’t know whether letting them land changes the local equilibrium by enabling alliances with your enemies; and you don’t know whether the boat carries Orellana or someone worse.
Do you trade? Attack? Flee? Coordinate? Any move could be fatal, and the entire situation destabilizes before anyone has to decide “we should exterminate them.”
And for all of these situations you can actually see what happened (approximately), and usually it doesn’t end well.
Why is AI different?
This isn’t rhetorical, and it gives them space to think in a smaller, more structured way that doesn’t force an answer.
I have many disagreements, but I’ll focus on one: I think point 2 is in contradiction with points 3 and 4. To put it plainly: the “selection pressures” go away pretty quickly if we don’t have reliable methods of knowing or controlling what the AI will do, or preventing it from doing noticeably bad stuff. That applies to the obvious stuff like if AI tries to prematurely go skynet, but it also applies to more mundane stuff like getting an AI to act reliably more than 99% of the time.
I believe that if we manage to control AI enough to make widespread rollout feasible, then it’s pretty likely we’ve already solved alignment well enough to prevent extinction.
Hmm right now this seems wrong to me, and also not worth going into in an introductory post. Do you have a sense that your view is commonplace? (eg from talking to many people not involved in AI)
I’m pro-nuclear, but the commonly used EA framing of “nuclear is overregulated” seems net negative more often than not. Clearer Thinking’s new nuclear episode is one of the more epistemically rigorous discussions I’ve heard in EA-adjacent spaces (and Founders Pledge has also done nuanced work).
Nuclear is worth pursuing, but we should argue for it clear-eyed.
Net negative because it is a true statement? Or some other reason?
Good question — I think it’s mostly untrue as commonly used. It implies regulation is the main bottleneck, but as the podcast lays out, there are likely much better levers for driving down cost. So it’s both misleading and counterproductive as a talking point, even if you’re broadly pro-nuclear (which I and the podcast guest are).
Out of curiosity: Where have EAs argued that “nuclear is overregulated” and, more specifically, where have EAs argued that over-regulation is the only or dominant driver of the cost problem?
It’s probably true that this sometimes happens—especially when EAs outside of climate/energy point to “nuclear is overregulated” as something in line with libertarian / abundance-y priors—but I think those in EA who have done work on nuclear would not subscribe to or spread the view that regulation is the only driver of nuclear’s problems.
That said, it seems clearly true—and I do think Isabelle agrees with that—that regulatory reform is a necessary component of making nuclear in the West buildable at scale again (alongside many other factors, such as sustained political will, technological progress, re-established supply chains, valuing clean firm power for its attributes, etc).
Good question. I agree: people in EA who’ve actually worked on nuclear don’t usually claim over-regulation is the only or even dominant driver of the cost/buildout problem.
What I’m reacting to is more the “hot take” version that shows up in EA-adjacent podcasts — often as an analogy when people talk about AI policy: “look at nuclear, it got over-regulated and basically died, so don’t do that to AI.” In that context it’s not argued carefully, it’s just used as a rhetorical example, and (to me) it’s a pretty lossy / misleading compression of what’s going on.
So I’m not trying to call out serious nuclear work in EA — I’m mostly sharing the Clearer Thinking episode as a good “orientation reset” because it keeps pointing back to what the binding constraints plausibly are, with regulation as one (maybe not even the main) piece of a complex situation.
Also possible I’m misremembering some of the specific instances — I haven’t kept notes — but I’ve heard the framing enough that it started to rub me the wrong way.
And I’m genuinely curious where you land on the “regulatory reform is necessary” point: do you think the key thing is removing regulation, changing it, or adding policy/market design (e.g. electricity market reform / stable revenue mechanisms / valuing clean firm power)? I’m currently leaning toward “markets/revenue model is the real lever”, but I’m not confident.
One thing I loved reading was a model of Sweden’s total system cost with vs without nuclear (incl. stuff like transmission build-out). It suggested fairly similar overall cost in both worlds — but the nuclear-heavy system leaned more on established tech (less batteries, etc., and I don’t remember if demand response was included).
My read is that the real challenge is: even if total system costs are comparable, how do you actually allocate those costs and rewards in something resembling a market so the “good” system gets built? (Unless you go much more “total state-owned super regulated” and basically nationalising the whole thing.)
I agree it’s a bit lossy and sometimes reflexive (this is what I meant with relying on libertarian priors), but I am still confused about your argument.
Because the argument you criticize is a historical one (“nuclear over-regulation killed nuclear”), which is different from “now we need many steps and there are different strategies to make nuclear more competitive again”.
I think it is basically correct that over-regulation played a huge part in making nuclear uncompetitive and I don’t think that Isabelle or others knowing the history of nuclear energy would disagree with that, even if it might be a bit overglossed / stylized (obviously, it is not the only thing).
Ah, now I see—thanks for clarifying. Yes historically I do not know how much each set-back to nuclear mattered. I can see that e.g. constantly changing regulation, for example during builds (which I think Isabelle actually mentioned) could cause a significant hurdle for continuing build-out. Here I would defer to other experts like you and Isabelle.
Porting this over to “we might over-regulate AI too”, I am realizing it is actually unclear to me whether people who use the “nuclear is over-regulated” example mean that the literal same “historical” thing could happen to AI:
— We put in place constantly changing regulation on AI
— This causes large training runs to be stopped midway
— In the end this makes AI uncompetitive with humans and the AI never really takes off
— Removing regulation is not able to kick-start the industry, as talent has left and we no longer know how to build large language models cost-effectively
Writing this I still think I stand by my point that there are much better examples in terms of regulation holding progress back (speeding up vaccine development actually being such an EA cause area, human challenge trials etc.). I can lay out the arguments for why this is so if helpful. But it is basically something like “there is probably much more path dependency in nuclear compared to AI or pharma”.
U.S. Politics should be a main focus of US EAs right now. In the past year alone, every major EA cause area has been greatly hurt or bottlenecked by Trump. $40 billion in global health and international development funds was lost when USAID shut down, which some researchers project could lead to 14 million more deaths by 2030. Trump has signed an Executive Order that aims to block states from creating their own AI regulations, and has allowed our most powerful chips to be exported to China. Trump has withdrawn funding from, and U.S. support for, international governance bodies like the United Nations and the World Health Organization, thereby removing the world’s most influential country from the collaborative efforts necessary to combat climate change and global pandemics. Most recently, the administration even changed nutritional guidelines, encouraging Americans to eat more animal protein than ever, which could drive more demand for unethically-produced animal products. In addition to all of this, Trump has continuously acted undemocratically, brazenly breaking norms and laws meant to protect us from autocracy and dictatorship. This goes to show just how determinative U.S. politics is to our successes and our failures.
I agree. Basically anyone not in a politically sensitive role (this category is broader than it might intuitively seem) should be looking to make large donations in this area now and others should be reaching out to EAs focused on US politics if they feel well equipped to run or contribute to a high leverage project.
Unfortunately there is no AMF/GiveDirectly for politics, and most things you can donate to are very poorly leveraged. Likewise, it is hard to both scope a leveraged project and execute well on it. I know of one general exception at the moment, which I’m happy to recommend privately.
I’m also happy to speak to anyone who intends to devote considerable money or work resources to this and pass them along to the people doing the best work here if that makes sense.
On the bright side, we might end up getting an AI pause out of this, if the Netherlands wakes up and decides that it no longer wants to help supply chips for advanced AI which could either be (a) misaligned or (b) controlled by Trump. See previous discussion, protest. I reckon this moment represents a strong opportunity for Dutch EAs concerned with AI risks. Maybe get a TV interview where you explain how ASML is supplying chips to the US, then explain AI risk, etc.
In terms of red-teaming my own suggestion, I am somewhat worried about further politicizing the issue of AI / highlighting national rivalries. Seems best to push for symmetric restrictions on China—they are directly supplying materials to Russia for its war in Ukraine, after all. Eliezer Yudkowsky could be an interesting person to contact for red-teaming purposes, since he’s strongly in favor of an AI pause, but also seems to resist any “international rivalry” framing of AI risk concerns?
I agree that this is a very important issue right now, but I’m not sure what we can do about it.
https://www.powerfordemocracies.org/research/our-recommendations/ !!
I think by the nature of how the EA Forum works, any proposed solution is likely to be more controversial than a generic “someone should do something about US politics” message. So any proposed solution will get at least a few early downvotes, causing low visibility. EAs want to upvote things which feel official and authoritative. They usually seem uninterested in improvisational brainstorming in response to an evolving situation. This will cause a paradoxical result where despite the “someone should do something about US politics” talk, proposing solutions will feel like a waste of time.
Maybe it would be good to create a dedicated brainstorming thread to try and mitigate this a little bit.
More good news! The Norwegian meat industry announced that it will stop using fast-growing chicken breeds by the end of 2027. These breeds are a source of immense suffering due to the toll such rapid growth takes on the animals’ bodies.
Norway will be the first country to stop using them.
More here: https://animainternational.org/blog/norway-ends-fast-growing-chickens
Reminder: claim tax relief on charitable donations (UK PAYE taxpayers)
If you:
Pay the higher rate of tax in the UK (earn over £50,271 or £43,663 in Scotland)
Don’t fill in a Self Assessment tax return (you pay tax automatically via PAYE)
Made donations that you claimed Gift Aid on in this or any of the previous 4 tax years
You can use this HMRC link to tell HMRC how much you’ve donated excluding Gift Aid and claim back the difference. More details in this evergreen post.
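To make the arithmetic concrete, here is a minimal sketch of how the reclaimable amount works out (my own illustration, not an HMRC tool). It assumes the standard rest-of-UK rates of 20% basic and 40% higher; Scottish bands differ, and the function name and defaults are just for this example:

```python
# Rough sketch of higher-rate Gift Aid relief (illustrative only).
# Assumes rest-of-UK rates: 20% basic, 40% higher; adjust for Scotland.

def gift_aid_relief(net_donation: float,
                    marginal_rate: float = 0.40,
                    basic_rate: float = 0.20) -> float:
    """Estimate the tax you can personally reclaim on a Gift Aid donation.

    net_donation  -- what you actually gave, excluding Gift Aid
    marginal_rate -- your top rate of income tax (e.g. 0.40 or 0.45)
    basic_rate    -- the rate the charity already reclaims (0.20)
    """
    gross = net_donation / (1 - basic_rate)      # what the charity receives
    return gross * (marginal_rate - basic_rate)  # the difference you claim back

# Example: a £100 donation grosses up to £125; a 40% taxpayer reclaims £25.
print(gift_aid_relief(100.0))  # -> 25.0
```

So, under these assumptions, a higher-rate taxpayer effectively gets back 25% of whatever they gave out of pocket.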
Practical tip
I set my Giving What We Can pledge tracking to run from 6 April to 5 April, which matches the UK tax year. That makes it easy to report the correct annual figure to HMRC.
Important notes / assumptions
This does not apply if you donate through Give As You Earn (the relief is already applied).
If you’re claiming relief on £10,000 or more of donations, you must also tell HMRC:
the date(s) of the donation(s)
which charity you donated to
A parliamentary question on wild animal welfare was recently submitted in the European Parliament! The question focuses on human-caused wild animal suffering, and such questions generally don’t have policy implications—but still, I was surprised to see this topic being taken up in policy discourse.
https://www.europarl.europa.eu/doceo/document/E-10-2025-004965_EN.html
Are there any signs of governments beginning to do serious planning for the need for Universal Basic Income (UBI) or negative income tax...it feels like there’s a real lack of urgency/rigour in policy engagement within government circles. The concept has obviously had its high-level advocates a la Altman but it still feels incredibly distant as any form of reality.
Meanwhile the impact is being seen in job markets right now—in the UK, graduate job openings have plummeted in the last 12 months. People I know with elite academic backgrounds are having a hard enough time finding jobs—let alone the vast majority of people who went to average universities. This is happening today—before there’s any consensus on the arrival of AGI or widely recognised mass displacement in mid-career job markets. The impact is happening now, but preparation for major policy intervention in current fiscal scenarios seems really far off. If governments do view the risk of major employment-market disruption as a realistic possibility (which I believe in many cases they do), are they planning for interventions behind the scenes? Or do they view the problem as too big to address until it arrives...viewing rapid response > careful planning, in the way the COVID emergency fiscal interventions emerged?
Would be really interested to hear of any good examples of serious thinking/preparation on how some form of UBI could be planned for (logistically and fiscally) on a near-term 5-year horizon.
Hmm, I think that’s not the right framing for this. UBI is just not settled as a universally good idea in academic or political circles (sorry, no definitive citation for this), let alone that there’s an urgent unemployment crisis (the statistic I think you’re citing is for job openings, not actual employment rates) or that such a crisis, if it did exist, has structural causes which could be expected to increase (i.e. it might not be AI, nor should we necessarily expect AI to become orders of magnitude more advanced in the next 5 years; there was plausibly a very different shock to the global economic system beginning around Liberation Day, 2025).
Thanks for your take—I always appreciate slightly less doom and gloom perspectives.
On your point that there’s not an imminent unemployment crisis and what impacts we are seeing may be due to other factors. Firstly I think it’s inevitable that the direct causes of disruption to the labour market are going to be multifaceted given the current trajectory of global markets (de-coupling, de-globalisation etc.)whatever happens moving forward. In the UK specifically part of the issue is minimum wage has been increased, making employers less inclined to hire grads (and yes it’s grad openings which have halved, but we’re simultaneously seeing 18-25 unemployment rising...not yet anywhere close to 08 levels but gradually increasing) - but this and other factors aren’t independent of AI, but rather accelerate it’s impacts (i.e. right now AI probably can’t replace all grad jobs, but CEOs are more willing to experiment, explore, and begin premature replacement because of rising grad costs and higher tax rates...but now those jobs are gone they’re off the market. The more broad market shocks there are, the more businesses will look to AI to cut costs)
Secondly I definitely take your point that AI may not become orders of magnitude more capable in the next five years and that ‘AI is coming for our jobs’ could be overblown. I suppose my thinking is—it might. Even if there is say, a 30% chance that in the next 5 years even 30-40% of white collar jobs get replaced...that feels like a massive shock to high-income countries way of life, and a major shift in the social contract for the generation of kids who have got into massive debt for university courses only to find substantially reduced market opportunities. That requires substantial state action.
And that’s the more conservative end of things; if there’s any reasonable chance that within even the next 10-15 years AI becomes capable of replacing a higher percentile of the total job market, surely we need some form of effective way of ensuring people with no job opportunities have some degree of resources. Even if you’re correct that it’s fairly unlikely—to avoid major social instability in the event of such a scenario it feels prudent that governments would be doing serious planning for the potential (just as they would for pandemics or war gaming no matter how unlikely the scenario)
And finally—you may well be right that UBI is not accepted as a good idea in political circles. I’m not wedded to that particular approach, and have heard ideas about negative income tax floating around. But in a scenario where there simply isn’t sufficient available employment for a functioning labour market and allocation of basic resources to all citizens—I’d like to think that governments are putting serious thinking and planning into how we ensure society continues to function, and what the social contract might become, and not that we’re waiting until that reality comes to pass to plan for an appropriate response.
Technical Alignment Research Accelerator (TARA) applications close today!
Last chance to apply to join the 14-week, remotely taught, in-person run program (based on the ARENA curriculum) designed to accelerate APAC talent towards meaningful technical AI safety research.
TARA is built for you to learn around full-time work or study by attending meetings in your home city on Saturdays and doing independent study throughout the week. Finish the program with a project to add to your portfolio, key technical AI safety skills, and connections across APAC.
See this post for more information and apply through our website here.
Super sceptical, probably highly intractable thought that I haven’t done any research on: There seem to be a lot of reasons to think we might be living in a simulation besides just Nick Bostrom’s simulation argument, like:
All the fundamental constants and properties of the universe are perfectly suited to the emergence of sentient life. This could be explained by the Anthropic principle, or it could be explained by us living in a simulation that has been designed for us.
The Fermi Paradox: there don’t seem to be any other civilizations in the observable universe. There are many explanations for the Fermi Paradox, but one additional explanation might be that whoever is simulating the universe created it for us, or they don’t care about other civilizations, so haven’t simulated them.
We seem to be really early on in human history. Only around 100 billion people have ever lived IIRC, but we expect many trillions to live in the future. This can be explained by the Doomsday argument—that in fact we are in the time in human history where most people will live because we will soon go extinct. However, this phenomenon can also be explained by us living in a simulation—see next point.
Not only are we really early, but we seem to be living at a pivotal moment in human history that is super interesting. We are about to create intelligence greater than ourselves, expand into space, or probably all die. Like if any time in history were to be simulated, I think there’s a high likelihood it would be now.
If I was pushed into a corner, I might say the probability we are living in a simulation is like 60%, where most evidence seems to point towards us being in a simulation. However, the doubt comes from the high probability that I’m just thinking about this all wrong—like, of course I can come up with a motivation for a simulation to explain any feature of the universe… it would be hard to find something that doesn’t line up with an explanation of the simulators just being interested in that particular thing. But in any case, that’s still a really high probability of everyone I love potentially not being sentient or even real (fingers crossed we’re all in the simulation together). Also, being in a simulation would change our fundamental assumptions about the universe and life, and it would be really weird if that had no impact on moral decision-making.
But everyone I talk to seems to have a relaxed approach to it, like it’s impossible to make any progress on this and that it couldn’t possibly be decision-relevant. But really, how many people have worked on figuring it out with a longtermist or EA-mindset? Some reasons it might be decision-relevant:
We may be able to infer from the nature of the universe and the natural problems ahead of us what the simulators are looking to understand or gain from the simulation (or at least we might attach percentage likelihoods to different goals). Maybe there are good arguments to aim to please the simulators, or not. Maybe we end the simulation if there are end-conditions?
Being in a simulation affects the probability that aliens exist (they probably have a lower probability of existing if we are in a simulation), which helps with long-term grand planning. Like, we wouldn’t need to worry about integrating defenses against alien attacks or engaging in acausal trade with aliens.
We can disregard arguments like the Doomsday Argument, lowering our p(doom).
Some questions I’d ask are:
How much effort have we put into figuring out if there is something decision-relevant to do about this from a moral impact perspective? How much effort should we put into this?
How much effort has gone into figuring out if we are, in fact, in a simulation, using empiricism? What might we expect to see in a simulated universe vs a real world? How can we search for and detect that?
Overall, this does sound nuts to me and it probably shouldn’t go further than this quick take, but I do feel like there could be something here, and it’s probably worth a bit more attention than I think it has gotten (like 1 person doing a proper research project on it at least). Lots of other stuff sounded crazy but now has significant work and (arguably) great progress, like trying to help people billions of years in the future, working on problems associated with digital sentience, and addressing wild animal welfare. There could be something here and I’d be interested in hearing thoughts (especially a good counterargument to working on this so I don’t have to think about it anymore) or learning about past efforts.
All the things you mentioned aren’t uniquely evidence for the simulation hypothesis but are equally evidence for a number of other hypotheses, such as the existence of a supernatural, personal God who designed and created the universe. (There are endless variations on this hypothesis, and we could come up with endlessly more.)
The fine-tuning argument is a common argument for the existence of a supernatural, personal God. The appearance of fine-tuning supports this conclusion equally as well as it supports the simulation hypothesis.
Some young Earth creationists believe that dinosaur fossils and other evidence of an old Earth were intentionally put there by God to test people’s faith. You might also think that God tests our faith in other ways, or plays tricks, or gets easily bored, and creates the appearance of a long history or a distant future that isn’t really there. (I also think it’s just not true that this is the most interesting point in history.)
Similarly, the book of Genesis says that God created humans in his image. Maybe he didn’t create aliens with high-tech civilizations because he’s only interested in beings with high technology made in his image.
It might not be God who is doing this, but in fact an evil demon, as Descartes famously discussed in his Meditations around 400 years ago. Or it could be some kind of trickster deity like Loki who is neither fully good nor fully evil. There are endless ideas that would slot in equally well to replace the simulation hypothesis.
You might think the simulation hypothesis is preferable because it’s a naturalistic hypothesis and these are supernatural hypotheses. But this is wrong, the simulation hypothesis is a supernatural hypothesis. If there are simulators, the reality they live in is stipulated to have different fundamental laws of nature, such as the laws of physics, than exist in what we perceive to be the universe. For example, in the simulators’ reality, maybe the fundamental relationship between consciousness and physical phenomena such as matter, energy, space, time, and physical forces is such that consciousness can directly, automatically shape physical phenomena to its will. If we observed this happening in our universe, we would describe this as magic or a miracle.
Whether you call them “simulators” or “God” or an “evil demon” or “Loki”, and whether you call it a “simulation” or an “illusion” or a “dream”, these are just different surface-level labels for substantially the same idea. If you stipulate laws of nature radically other than the ones we believe we have, what you’re talking about is supernatural.
If you try to assume that the physics and other laws of nature in the simulators’ reality are the same as in our perceived reality, then the simulation argument runs into a logical self-contradiction, as pointed out by the physicist Sean Carroll. Endlessly nested levels of simulation mean computation in the original simulators’ reality will run out. Simulations at the bottom of the nested hierarchy, which don’t have enough computation to run still more simulations inside them, will outnumber higher-level simulations. Since the simulation argument says, as one of its key premises, that in our perceived reality we will be able to create simulations of worlds or universes filled with many digital minds, but the simulation hypothesis implies this is actually impossible, the simulation argument’s conclusion contradicts one of its premises.
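To make the counting step concrete, here is a toy sketch (my own illustration with made-up notation, not taken from Carroll’s post): think of the real universe plus all nested simulations as a tree, where each level that can run simulations spawns at least k ≥ 2 children before compute runs out at some nesting depth D. Then:

```latex
% Toy counting model (illustrative assumptions: branching factor k >= 2,
% compute exhausted at nesting depth D; the real universe sits at the root).
\[
  \text{bottom-level nodes} = k^{D}, \qquad
  \text{all nodes} = \frac{k^{D+1}-1}{k-1},
\]
\[
  \frac{\text{bottom-level}}{\text{all}}
    = \frac{k^{D}\,(k-1)}{k^{D+1}-1}
    \;\approx\; \frac{k-1}{k} \;\ge\; \frac{1}{2}.
\]
```

So under these toy assumptions, at least half of all observers sit at the level that cannot host further simulations, which is the tension with the argument’s premise that Carroll points to.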
There are other strong reasons to reject the simulation argument. Remember that a key premise is that we ourselves or our descendants will want to make simulations. Really? They’ll want to simulate the Holocaust, malaria, tsunamis, cancer, cluster headaches, car crashes, sudden infant death syndrome, and Guantanamo Bay? Why? On our ethical views today, we would not see this as permissible, but rather the most grievous evil. Why would our descendants feel differently?
Less strongly, computation is abundant in the universe but still finite. Why spend computation on creating digital minds inside simulations when there is always a trade-off between doing that and creating digital minds in our universe, i.e. the real world? If we or our descendants think marginally and hold as one of our highest goals to maximize the number of future lives with a good quality of life, using huge amounts of computation on simulations might be seen as going against that goal. Plus, there are endlessly more things we could do with our finite resource of computation, most we can’t imagine today. Where would creating simulations fall on the list?
You can argue that creating simulations would be a small fraction of overall resources. I’m not sure that’s actually true; I haven’t done the math. But just because something is a small fraction of overall resources doesn’t mean it will likely be done. In an interstellar, transhumanist scenario, our descendants could create a diamond statue of Hatsune Miku the size of the solar system and this would take a tiny percentage of overall resources, but that doesn’t mean it will likely happen. The simulation argument specifically claims that making simulations of early 21st century Earth will interest our descendants more than alternative uses of resources. Why? Maybe they’ll be more interested in a million other things.
Overall, the simulation hypothesis is undisprovable but no more credible than an unlimited number of other undisprovable hypotheses. If something seems nuts, it probably is. Initially, you might not be able to point out the specific logical reasons it’s nuts. But that’s to be expected — the sort of paradoxes and thought experiments that get a lot of attention (that “go viral”, so to speak) are the ones that are hard to immediately counterargue.
Philosophy is replete with oddball ideas that are hard to convincingly refute at first blush. The Chinese Room is a prime example. Another random example is the argument that utilitarianism is compatible with slavery. With enough time and attention, refutations may come. I don’t think one’s inability to immediately articulate the logical counterargument is a sign that an oddball idea is correct. It’s just that thinking takes time and, usually, by the time an oddball idea reaches your desk, it’s proven to be resistant to immediate refutation. So, trust that intuition that something is nuts.
Strong upvoted as that was possibly the most compelling rebuttal to the simulation argument I’ve seen in quite a while, which was refreshing for my peace of mind.
That being said, it mainly targets the idea of a large-scale simulation of our entire world. What about the possibility that the simulation is for a single entity and that the rest of the world is simulated at a lower fidelity? I had the thought that a way to potentially maximize future lives of good quality would be to contain each conscious life in a separate simulation where they live reasonably good lives catered to their preferences, with the apparent rest of the world being virtual. Granted, I doubt this conjecture because in my own opinion my life doesn’t seem that great, but it seems plausible at least?
Also, that line about the diamond statue of Hatsune Miku was very, very amusing to this former otaku.
Changing the simulation hypothesis from a simulation of a world full of people to a simulation of an individual throws the simulation argument out the window. Here is how Sean Carroll articulates the first three steps of the simulation argument:
We can easily imagine creating many simulated civilizations.
Things that are that easy to imagine are likely to happen, at least somewhere in the universe.
Therefore, there are probably many civilizations being simulated within the lifetime of our universe. Enough that there are many more simulated people than people like us.
The simulation argument doesn’t apply to you, as an individual. Unless you think that you, personally, are going to create a simulation of a world or an individual — which obviously you’re not.
Changing the simulation hypothesis from a world-scale simulation to an individual-scale simulation also doesn’t change the other arguments against the simulation hypothesis:
The bottoming out argument. This is the one from Sean Carroll. Even if we supposed you, personally, were going to create individual-scale simulations in the future, eventually a nesting cascade of such simulations would exhaust available computation in the top-level universe, i.e. the real universe. The bottom-level simulations within which no further simulations are possible would outnumber higher-level ones. The conclusion of the simulation argument contradicts a necessary premise.[1]
The ethical argument. It would be extremely unethical to imprison an individual in a simulation without their consent, especially a simulation with a significant amount of pain and suffering that the simulators are programming in. Would you create an individual-scale simulation even of an unrealistically pleasant life, let alone a life with significant pain and suffering? If we had the technology to do this today, I think it would be illegal. It would be analogous to false imprisonment, kidnapping, torture, or criminal child abuse (since you are creating this person).
The computational waste argument. Making an individual-scale simulation would require at least as much computation as creating a digital mind in the real universe. In fact, it would require more computation, since you also have to simulate the whole world around the individual, not just the individual themselves. If the simulators think marginally, they would prefer to use these resources to create a digital mind in the real universe or put them to some other, better use.
If the point of the simulation is to cater it to the individual’s preferences, we should ask:
a) Why isn’t this actually happening? Why is there so much unnecessary pain and suffering and unpleasantness in every individual’s life? Why simulate the covid-19 pandemic?
b) Why not cater to the individual’s fundamental and overriding preference not to be in a simulation?
c) Why not put these resources toward any number of superior uses that must surely exist?[2]
Perhaps most importantly, changing the simulation hypothesis from world-scale to individual-scale doesn’t change perhaps the most powerful counterargument to the simulation hypothesis:
The unlimited arbitrary, undisprovable hypotheses argument. There is no reason to think the simulation hypothesis makes any more sense or is any more likely to be true than the hypothesis that the world you perceive is an illusion created by an evil demon or a trickster deity like Loki. There are an unlimited number of equally arbitrary and equally unjustified hypotheses of this type that could be generated. In my previous comment, I argued that versions of the simulation hypotheses in which the laws of physics or laws of nature are radically different in the real universe than in the simulation are supernatural hypotheses. Versions of the simulation hypothesis that assume real universe physics is the same as simulation physics suffer from the bottoming out argument and the computational waste argument. So, either way, the simulation hypothesis should be rejected. (Also, whether the simulation has real universe physics or not, the ethical argument applies — another reason to reject it.)
This argument also calls into question why we should think simulation physics is the same as real universe physics, i.e. why we should think the simulation hypothesis makes more sense as a naturalistic hypothesis than a supernatural hypothesis. The simulation hypothesis leans a lot on the idea that humans or post-humans in our hypothetical future will want to create “ancestor simulations”, i.e. realistic simulations of the simulators’ past, which is our present. If there were simulations, why would ancestor simulations be the most common type? Fantasy novels are about equally popular as historical fiction or non-fiction books about history. Would simulations skew toward historical realism significantly more than books currently do? Why not simulate worlds with magic or other supernatural phenomena? (Maybe we should conclude that, since this is more interesting, ghosts probably exist in our simulation. Maybe God is simulated too?) The “ancestor simulation” idea is doing a lot of heavy lifting; it’s not clear that this is in any way a justifiable assumption rather than an arbitrary one. The more I dig into the reasoning behind the simulation hypothesis, the more it feels like Calvinball.[3]
The individual-scale simulation hypothesis also introduces new problems that are unique to it:
Simulation of other minds. If you wanted to build a robot that could perfectly simulate the humans you know best, the underlying software would need to be a digital mind. Since, on the individual-scale simulation hypothesis, you are a digital mind, then the other minds in the simulation — at least the ones you know well — are as real as you are. You could try to argue that these other minds only need to be partially simulated. For example, the mind simulations don’t need to be running when you aren’t observing or interacting with these people. But then why don’t these people report memory gaps? If the answer is that the simulation fills in the gaps with false memories, what process continually generates new false memories? Why would this process be less computationally expensive than just running the simulation normally? (You could also try to say that consciousness is some kind of switch that can be flipped on or off for some simulations but not others. But I can’t think of any theory of consciousness this would be compatible with, and it’s a problem for the individual-scale simulation hypothesis if it just starts making stuff up ad hoc to fit the hypothesis.)
If we decide that at least the people you know well must be fully simulated, in the same way you are, then what about the people they know well? What about the people who they know well know well? If everyone in the world is connected through six degrees of separation or fewer, then it seems like individual-scale simulations are actually impossible and all simulations must be world-scale simulations.
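To make the transitivity point concrete, here is a minimal sketch in Python with a hypothetical toy “knows well” graph (the names and links are invented for illustration; the assumption doing the work is that the real social graph is connected within roughly six hops). If everyone a fully simulated person knows well must themselves be fully simulated, the closure of that relation covers the whole graph:

```python
# A minimal sketch (toy data, for illustration only) of the transitivity point:
# if everyone you know well must be fully simulated, and everyone they know
# well must be too, a connected social graph forces a world-scale simulation.
from collections import deque

# Toy "knows well" graph; the key assumption is that the real-world graph
# is connected (the "six degrees of separation" claim).
knows_well = {
    "you": ["alice", "bob"],
    "alice": ["you", "carol"],
    "bob": ["you", "dana"],
    "carol": ["alice", "erin"],
    "dana": ["bob", "erin"],
    "erin": ["carol", "dana"],
}

def must_be_fully_simulated(start: str) -> set[str]:
    """Breadth-first closure: everyone reachable through 'knows well' links."""
    seen, queue = {start}, deque([start])
    while queue:
        person = queue.popleft()
        for contact in knows_well[person]:
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return seen

print(must_be_fully_simulated("you"))  # the entire (connected) graph
```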
Abandoning the simulation of history at large scale. Individual-scale simulations don’t provide the same informational value that world-scale simulations might. When people talk about why “ancestor simulations” would supposedly be valuable or desired, they usually appeal to the notion of simulating historical events on a large scale. This obviously wouldn’t apply to individual-scale simulations. To the extent credence toward the simulation hypothesis depends on this, an individual-scale simulation hypothesis may be even less credible than a world-scale simulation hypothesis.
The Wikipedia page on the simulation hypothesis notes that it’s a contemporary twist on a centuries-old if not millennia-old idea. We’ve replaced dreams and evil demons with computers, but the underlying idea is largely the same. The reasons to reject it are largely the same, although the simulation argument has some unique weaknesses. That page is a good resource for finding still more arguments against the simulation hypothesis.[4]
Carroll, who is a physicist and cosmologist, also criticizes the anthropic reasoning of the simulation argument. I recommend reading his post; it’s short and well-written.
You could try to argue that, despite society’s best efforts, it will be impossible to prevent a large number of simulations from being created. Pursuing this line of argument requires speculating about the specific details of a distant, transhuman or post-human future. Would an individual creating a simulation be more like an individual today operating a meth lab or launching a nuclear ICBM? I’m not sure we can know the answer to this question. If dangerous or banned technologies can’t be controlled, what does this say about existential risk? Will far future, post-human terrorists be able to deploy doomsday devices? If so, that would undermine the simulation argument. (Will post-humans even have the desire to be terrorists, or is that a defect of humanity?)
Related to this are various arguments that the simulation argument is self-defeating. We infer things about the real universe from our perceived universe. We then conclude that our perceived universe is a simulation. But, if it is, this undermines our ability to infer anything about the real universe from our perceived universe. In fact, this undermines the inference that our perceived universe is a simulation within a real universe. So, the simulation argument defeats itself.
In addition to all the above, I would be curious to hear empirical, scientific arguments about the amount of computation that might be required for world-scale simulations, which would be partly applicable to individual-scale simulations. Obviously, our universe can’t run a full-scale, one-to-one simulation of our universe with perfect fidelity — that would require more computation, matter, and energy than our universe has. If you only simulate the solar system with perfect fidelity, you can pare that down a lot. You can make other assumptions to pare down the computation required. It’s much less important than all the arguments and considerations described above, but if we get a better understanding of approximately how difficult or costly a world-scale simulation might be, that could help put some considerations like computational waste in perspective.
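For a sense of scale, here is a hedged back-of-envelope calculation in Python. Every number is a rough, Bostrom-style placeholder assumption for illustration rather than a measured quantity, and it only counts the minds themselves, not the surrounding environment:

```python
# Back-of-envelope sketch (all figures are assumptions for illustration):
# rough scale of simulating every human mind in history.

humans_ever_lived = 1e11          # ~100 billion people, a common rough estimate
seconds_per_life = 1.6e9          # ~50 years per life, on average
ops_per_second_per_brain = 1e16   # rough guess at brain-equivalent compute

total_ops = humans_ever_lived * seconds_per_life * ops_per_second_per_brain
print(f"Simulating every human mind in history: ~{total_ops:.0e} operations")
# -> ~2e36 operations, before simulating any of the world around those minds
```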
I would not describe the finetuning argument and the Fermi paradox as strong evidence in favour of the simulation hypothesis. I would instead say that they are open questions for which a lot of different explanations have been proposed, with the simulation hypothesis offering only one of many possible resolutions.
As to the “importance” argument, we shouldn’t count speculative future events as evidence that the present moment is important. I would say the mid-20th century was more important than today, because that’s the closest we ever got to nuclear annihilation (plus like, WW2).
I’ve thought about this a lot too. My general response is that it is very hard to see what one could do differently at a moment-to-moment level even if we were in a simulation. While it’s possible that you or I are alone in the simulation, we can’t, realistically, know this. We can’t know with much certainty that the apparently sentient beings who share our world aren’t actually sentient. And so, even if they are part of the simulation, we still have a moral duty to treat them well, on the chance they are capable of subjective experiences and can suffer or feel happiness (assuming you’re a utilitarian), or have rights/autonomy to be respected, etc.
We also have no idea who the simulators are and what purpose they have for the simulation. For all we know, we are a petri dish for some aliens, or a sitcom for our descendants, or a way for people’s minds on colony ships travelling to distant galaxies to spend their time while in physical stasis. Odds are, if the simulators are real, they’ll just make us forget about it if we ever figure it out, so they can continue the simulation for whatever reasons they have.
Given all this, I don’t see the point in trying to defy them or doing really anything differently than what you’d do if this was the ground truth reality. Trying to do something like attempting to escape the simulation would most likely fail AND risk getting you needlessly hurt in this world in the process.
If we’re alone in the sim, then it doesn’t matter what we do anyway, so I focus on the possibility that we aren’t alone, and everything we do does, in fact, matter. Give it the benefit of the doubt.
At least, that’s the way I see things right now. Your mileage may vary.
I made this simple high-level diagram of critical longtermist “root factors”, “ultimate scenarios”, and “ultimate outcomes”, focusing on the impact of AI during the TAI transition.
This involved some adjustments to standard longtermist language.
“Accident Risk” → “AI Takeover”
“Misuse Risk” → “Human-Caused Catastrophe”
“Systemic Risk” → This is split up into a few modules, focusing on “Long-term Lock-in”, which I assume is the main threat.
You can read and interact with it here, where there are (AI-generated) descriptions and pages for things.
Curious to get any feedback!
I’d love it if there could eventually be one or a few well-accepted and high-quality assortments like this. Right now some of the common longtermist concepts seem fairly unorganized and messy to me.
---
Reservations:
This is an early draft. There are definitely parts I find inelegant. I’ve played with the final nodes instead being things like “Pre-transition Catastrophe Risk” and “Post-Transition Expected Value”, for instance. I didn’t include a node for “Pre-transition value”; I think this can be added on, but would involve some complexity that didn’t seem worth it at this stage. The lines between nodes were mostly generated by Claude and could use more work.
This also heavily caters to the preferences and biases of the longtermist community, specifically some of the AI safety crowd.
Just finding out about this & the crux website. So cool. Would love to see something like this for charity ranking (if it isn’t already somewhere on the site).
Don’t you need a philosophy axioms layer between outputs and outcomes? Existential catastrophe definitions seems to be assuming a lot of things.
Would also need to think harder about why/in what context I’m using this, but “governance” being a subcomponent, when it’s arguably more important and can control literally everything else at the top level, seems wrong.
Good points!
>Would love to see something like this for charity ranking (if it isn’t already somewhere on the site).
I could definitely see this being done in the future.
>Don’t you need a philosophy axioms layer between outputs and outcomes?
I’m nervous that this can get overwhelming quickly. I like the idea of starting with things that are clearly decision-relevant to the particular audience the website has, then expanding from there. Am open to ideas on better / more scalable approaches!
>”governance” being a subcomponent when it’s arguably more important/ can control literally everything else at the top level seems wrong.
Thanks! I’ll keep that in mind. I’d flag that this is an extremely high-level diagram, meant more to be broad and elegant than to flag which nodes are most important. Many critical things are “just subcomponents”. I’d like to make further diagrams on many of the different smaller nodes.
According to someone I chatted to at a party (not normally the optimal way to identify top new cause areas!) fungi might be a worrying new source of pandemics because of climate change.
Apparently this is because thermal barriers prevented fungi from infecting humans, but because fungi are adapting to higher temperatures, they are now better able to overcome those barriers. This article has a bit more on this:
https://theecologist.org/2026/jan/06/age-fungi
Purportedly, this is even more scary than a pathogen you can catch from people, because you can catch this from the soil.
I suspect that if this were, in fact, the case, I would have heard about it sooner. Interested to hear comments from people who know more about it than me, or have more capacity than me to read up about it a bit.
Hi Sanjay,
When people ask me “What is one area or issue you wish people paid more attention to in global health?”, I almost always say fungal diseases.
I co-authored some reports on fungal infections (e.g., this one), and my impression is that it is indeed very plausible and well-recognized by experts that fungal infections will rise in a major way as a result of climate change, though I have not seen any guesses / estimates of how large the additional burden could be.
I think the more important point is that, regardless of climate change, fungal diseases are a massive disease burden source already. Fungal disease-related deaths are plausibly on the order of ~2M/year, likely more, and it is possible that DALYs are in a similar ballpark as TB, malaria, and HIV (though again unclear, because fungal diseases aren’t even comprehensively included in IHME’s global burden of disease estimates yet).
It is also incredibly neglected, to an extent that I find almost unbelievable. Though this has recently improved a bit, with more attention / funding from the Wellcome Trust coming in.
I think one reason that people aren’t jumping on fungal diseases despite high importance and neglectedness is that tractability is tricky. Fungal disease treatments are often not very effective, expensive, difficult to administer, and have lots of side effects. Also, there are LOTS of different fungal diseases, that all affect different populations, manifest differently, and require different diagnostics/treatment. So there isn’t really an easy one-size-fits-all solution here.
I do not find it surprising that you haven’t heard about it. Lots of people I know haven’t, and there are several reasons for this that are too long to explain here (though this article might help).
Maybe helpful for you to know that Coefficient Giving have done internal research on fungal diseases (they also commissioned our work on this topic), so they might have more thoughts on this.
Hi Jenny, very interesting, thank you. What was the response of CG to your report, and do you know if they are planning to invest more resources towards this potential cause area?
I’m not able to comment on CG’s reaction to the report, as those discussions are confidential.
What I can say is that they are still exploring this area internally (given that they commissioned us to do more work related to fungal diseases recently (see here)).
I’m not aware of any specific grantmaking decisions or commitments at this stage.
Thanks this is super interesting and definitely concerning.
FWIW, within the non-EA Global Health community this has been a topic of conversation for the last 3-4 years. It is a potential threat, but still seems like a super-low-probability X-ish risk, because...
a) We haven’t actually seen anything terribly dangerous happen yet
b) Antifungal medications already exist, and if there were a super-dangerous mass fungal threat I suspect we could make better ones pretty quick smart. But yes, this is far from guaranteed.
As a side note there are already plenty of pathogens we catch from the soil like anthrax and tetanus, as well as worms like hookworm!
The person I spoke to at the party said that he knew somebody who had a fungal infection and was likely to die from it.
I don’t know much about antifungals, but I infer from his comment that we don’t have enough antifungals to cover all of the potential fungal infections.
To my knowledge, there are a few (not actually that many) existing antifungals, but as I commented above, they mostly aren’t very good, and in several deadly fungal infections they are almost pointless.
Also, when a new fungal pathogen emerges, it might be harmless, or it might be big trouble; nobody can predict that. A good example I’ve seen mentioned a few times is Candida auris (a pretty serious and often deadly fungal infection), which emerged independently in several regions of the world around 2009, pretty much out of nowhere. And the scary thing is that it was drug-resistant from the start! I think researchers aren’t quite sure why it emerged, but it could be related to climate change.
The idea of fungi evolving to infect humans and resulting in apocalypse underpins the premise of the famous game and TV series “The Last of Us”
Given the series’ critical acclaim and popularity, I wonder if it also demonstrates potential for engaging the public with this topic through mainstream popular media.
I was wondering if anyone was going to mention that. There was a lot of media buzz about whether the events of the show could really happen at the time of its airing. This piece by Yale is supposed to sound reassuring, but it just… doesn’t. :/
Among other things, the natural-atrocity take on zombies is what made me fall in love with the TV series; it depressed me, but I was fascinated by this new aesthetic of dangerous nature globally killing human civilization (think overgrown moss on broken subways). I can indeed see visions of such a world motivating EA people to work on preventing it. 🪸
This idea gets discussed in infectious disease circles, but it is often framed more dramatically than the evidence supports. Fungi adapting to higher temperatures is real, Candida auris is a good example, but most fungi still struggle to survive in the human body and spread efficiently between people. Soil exposure already exists today, yet serious fungal infections remain rare and mostly affect immunocompromised individuals. It is a risk worth monitoring, not a hidden pandemic waiting to explode, which is likely why it has not triggered broader alarms outside specialist research.
What are people’s favorite arguments/articles/essays trying to lay out the simplest possible case for AI risk/danger?
Every single argument for AI danger/risk/safety I’ve seen seems to overcomplicate things. Either they have too many extraneous details, or they appeal to overly complex analogies, or they seem to spend much of their time responding to insider debates.
I might want to try my hand at writing the simplest possible argument that is still rigorous and clear, without being trapped by common pitfalls. To do that, I want to quickly survey the field so I can learn from the best existing work as well as avoid the mistakes they make.
My fave is @Duncan Sabien’s ‘Deadly by Default’.
Max Tegmark explains it best I think. Very clear and compelling and you don’t need any technical background to understand what he’s saying.
I believe it was his third (or maybe second) appearance on Lex Fridman’s podcast where I first heard his strongest arguments, although those episodes are quite long with extraneous content, so here is a version that is just the arguments. His solutions are somewhat specific, but overall his explanation is very good, I think:
Quick link-post highlighting Toner quoting Postrel’s dynamist rules + her commentary. I really like the dynamist rules as a part of the vision of the AGI future we should aim for:
“Postrel does describe five characteristics of ‘dynamist rules’:
I see some overlap with existing ideas in AI policy:
Transparency, everyone’s favorite consensus recommendation, fits well into a dynamist worldview. It helps with Postrel’s #1 (giving individuals access to better information that they can act on as they choose), #3 (facilitating commitments), and #4 (facilitating criticism and feedback). Ditto whistleblower protections.
Supporting the development of a third-party audit ecosystem also fits—it helps create and enforce credible commitments, per #3, and could be considered a kind of nestable framework, per #5.
The value of open models in driving decentralized use, testing, and research is obvious through a dynamist lens, and jibes with #1 and #4. (I do think there should be some precautionary friction before releasing frontier models openly, but that’s a narrow exception to the broader value of open source AI resources.)
Another good bet is differential technological development, aka defensive accelerationism—proactively building technologies that help manage challenges posed by other technologies—though I can’t easily map it onto Postrel’s five characteristics. I’d be glad to hear readers’ ideas for other productive directions to push in.”
At the NIH, Jay Bhattacharya did a lot to reduce animal experimentation and thus reduce animal suffering. As far as ChatGPT can tell, this seems to be completely ignored by the Effective Altruism forum.
Marty Makary’s FDA is also taking steps to reduce the need for animal testing in FDA approvals.
Is this simply because Effective Altruists don’t like the Trump administration, so they can’t take the win of MAHA bringing contrarians into control of health policy who do things like care more about reducing animal suffering and fight the replication crisis?
I don’t think so.
Some less tribalistic hypotheses I can think of:
EAs concerned about animal welfare have typically focused on farmed animals, as opposed to animal testing, because of the much larger scale of the suffering
EAs mostly haven’t heard of it.
Maybe some EAs have heard about it, but they don’t think it is worth the effort to write a post about it.
But tribalistic explanations could be a factor too (e.g. MAHA has anti-science vibes, and EAs like to stay on the pro-science side).
(This is probably not the most constructive feedback, but my initial reaction to this short form was that it felt like a right-wing analog of left-wing “Why don’t the EAs tweet about Gaza?”-style criticisms).
I was a bit worried for the last 3 weeks that the Forum had gone quiet...
Then I come back after a 5-day Ugandan internet blackout and there are lots of fantastic front-page posts. Great job everyone!!!
Well the blackouts are the only way to ensure a free & fair election Nick :)
true man long live the King!
Dwarkesh (of the famed podcast) recently posted a call for new guest scouts. Given how influential his podcast is likely to be in shaping discourse around transformative AI (among other important things), this seems worth flagging and applying for (at least, for students or early-career researchers in bio, AI, history, econ, math, or physics who have a few extra hours a week).
The role is remote, pays ~$100/hour, and expects ~5–10 hours/week. He’s looking for people who are deeply plugged into a field (e.g. grad students, postdocs, or practitioners) with high taste. Beyond scouting guests, the role also involves helping assemble curricula so he can rapidly get up to speed before interviews.
More details are in the blog post; link to apply (due Jan 23 at 11:59pm PST).
This is a solid opportunity for people who already live inside a domain and enjoy synthesis more than spotlight. The pay reflects the expectation of taste and context, not just surface level research. Helping shape guest selection and prep indirectly shapes the conversation, which matters given the reach of the podcast. For the right grad student or practitioner, this is leverage and learning at the same time.
+1 I would love an EA to be working on this.
I’d be keen for great people to apply to the Deputy Director role ($180-210k/y, remote) at the Mirror Biology Dialogues Fund. I spoke a bit about mirror bacteria on the 80k podcast, and James Smith also had a recent episode on it. I generally think this is among the most important roles in the biosecurity space; I’ve been working with the MBDF team for a while now and am impressed by what they’re getting done.
People might be surprised to hear that I put ballpark 1% p(doom) on mirror bacteria alone at the start of 2024. That risk has been cut substantially by the scientific consensus that has formed against building it since then, but there is some remaining risk that the boundaries are not drawn far enough from the brink that bad actors could access it. Having a great person in this role would help ensure a wider safety margin.
This role sounds important precisely because the risk is no longer theoretical but also not fully contained. Cutting risk through consensus helps, but it does not replace strong governance and clear red lines. A Deputy Director who understands both the technical details and the incentives of bad actors can close gaps that policy statements cannot. If mirror bacteria still sit close enough to misuse, staffing quality becomes a real safety control, not just an admin decision.
I notice the ‘guiding principles’ in the introductory essay on effectivealtruism.org have been changed. It used to list: prioritisation, impartial altruism, open truthseeking, and a collaborative spirit. It now lists: scope sensitivity, impartiality, scout mindset, and recognition of trade-offs.
As far as I’m aware, this change wasn’t signalled. I understand lots of work has recently been done to improve the messaging on effectivealtruism.org (which is great!), but it feels a bit weird for ‘guiding principles’ to have been changed without any discussion or notice.
As far as I understand, back in 2017 a set of principles was chosen through a somewhat deliberative process, and then organisations were invited to endorse them. That feels like a more appropriate process for such a change.
I can’t speak for the choice of principles themselves, but can give some context on why the change was made in the intro essay (and clarify a mistake I made).
There are different versions of EA principles online. One version was the 2017 CEA guiding principles you mention, which had endorsement from some other organisations. CEA added a new intro essay to effectivealtruism.org in 2022, with a different variation of the list of principles and Ben Todd as a main author: you can read the Forum post announcing the new essay here, and see the archived version here.
After Zach’s post outlining the set of principles that are core to CEA’s principles-first approach (these had existed for some time and been published on the CEA website, but not on effectivealtruism.org), we updated them in the intro essay for consistency. I also think Zach’s footnote gives helpful context:
“This list of principles isn’t totally exhaustive. For example, CEA’s website lists a number of “other principles and tools” below these core four principles and “What is Effective Altruism?” lists principles like “collaborative spirit”, but many of them seem to be ancillary or downstream of the core principles. There are also other principles like integrity that seem both true and extremely important to me, but also seem to be less unique to EA compared to the four core principles (e.g. I think many other communities would also embrace integrity as a principle).”
I also want to say thanks to you (and @Kestrel🔸) for pointing out that collaborative spirit is no longer mentioned, that was actually a mistake! When we updated the principles in the essay we still wanted to reference collaborative spirit, but I left that paragraph out by mistake. I’ve now added it:
“It’s often possible to achieve more by working together, and doing this effectively requires high standards of honesty, integrity, and compassion. Effective altruism does not mean supporting ‘ends justify the means’ reasoning, but rather is about being a good citizen, while ambitiously working toward a better world.”
Hi @Agnes Stenlund 🔸 ,
Last week I had a discussion about the core principles with someone at our EA office in Amsterdam. She also liked “collaborative spirit”. I remembered this discussion, decided to check again, and saw that you added this back into the intro essay. That’s great! Shouldn’t it then also be added to the “core principles” page? (Or am I overlooking something?)
Thanks for taking the time to provide this context!
Quick flag that the FAQ right below hasn’t been updated
Not sure how useful this is, and you mentioned you can’t speak for the choice of principles, but sharing on a personal note that the collaborative spirit value was one of the things I appreciated the most about EA when I first came across it.
I think that infighting is a major reason why EA and many similar movements achieve far less than they could. I really like when EA is a place where people with very different beliefs who prioritise very different projects can collaborate productively, and I think it’s a major reason for its success. It seems more unique/specific than acknowledging tradeoffs, more important to have explicitly written down as a core value to prevent the community from drifting away from it, and a great value proposition.
Like James, I also found it weird that what had become a canonical definition of EA was changed without a heads-up to its community.
In any case, thank you so much for all your work, and I’m grateful that thanks to you it survives as a paragraph in the essay.
Thanks for putting it back!
It’s really important to me, as I can sometimes find that the (non-EA) charity and government world is a bunch of status-based competition over funding pots that encourages flattery and truth distortions and bitterness.
And, ok, EA can be like that as well, but ideally it isn’t—ideally we’d be totally happy for our pet project to get cancelled and the money reallocated to doing a similar thing more efficiently. And we’d also support the people this happens to, recognising their inherent worth as community members and collaborators.
I am glad to see the term “truthseeking” go. The problems with this term: 1) it has never been clearly defined by anyone anywhere, 2) people seem to disagree about what it means, and 3) the main way it seems to be used in practice on the EA Forum is as an accusation made against someone else — but due to (1) and (2), it’s typically not clear what, exactly, the accusation is. “Scout mindset” is much more clearly defined, so it’s a good replacement. (I don’t particularly love that term, personally, but that’s neither here nor there.)
Scope sensitivity seems like a good replacement for prioritization, no? I guess scope sensitivity and recognition of trade-offs together have replaced prioritization. That seems fine to me. What do you think?
Impartial altruism and impartiality sound like the same thing. So, that’s fine.
I think Kestrel is right that the only clear substantive change is that collaborative spirit was dropped. Is that a good guiding principle? Could it also be substituted with something a bit clearer or better?
I don’t have a super strong view on which set of guiding principles is better—I just thought it was odd for them to be changed in this way.
If pushed, I prefer the old set, and a significant part of that preference stems from the amount of jargon in the new set. My ideal would perhaps be a combination of the old set and the 2017 set.
Expanding our moral circle
We work to overcome our natural tendency to care most about those closest to us. This means taking seriously the interests of distant strangers, future generations, and nonhuman animals—anyone whose wellbeing we can affect through our choices. We continuously question the boundaries we place around moral consideration, and we’re willing to help wherever we can do the most good, not just where helping feels most natural or comfortable.
Prioritisation
We do the hard work of choosing where to focus our limited time, money, and attention. This means being willing to say “this is good, but not the best use of marginal resources”—and actually following through, even when it means disappointing people or turning down appealing opportunities. We resist scope creep and don’t let personal preferences override our considered judgments about where we can have the most impact.
Scientific mindset
We treat our beliefs as hypotheses to be tested rather than conclusions to be defended. This means actively seeking disconfirming evidence, updating based on data, and maintaining genuine uncertainty about what we don’t yet know. We acknowledge the limits of our evidence, don’t oversell our findings, and follow arguments wherever they lead—even when the conclusions are uncomfortable or threaten projects we care about.
Openness
We take unusual ideas seriously and are willing to consider approaches that seem weird or unconventional if the reasoning is sound. We default to transparency about our reasoning, funding, mistakes, and internal debates. We make our work easy to scrutinise and critique, remain accessible to people from different backgrounds, and share knowledge rather than hoarding it. We normalise admitting when we get things wrong and create cultures where people can acknowledge mistakes without fear, while still maintaining accountability.
Acting with integrity
We align our behaviour with our stated values. This means being honest even when it’s costly, keeping our commitments, and treating people ethically regardless of their status or usefulness to our goals. How we conduct ourselves—especially toward those with less power—reflects our actual values more than our stated principles. We hold ourselves and our institutions to high standards of personal and professional conduct, recognising that being trustworthy is foundational to everything else.
Wow, I like scientific mindset a lot more than “truthseeking” (what does it mean??) or scout mindset!
I think you are right that there is too much jargon in the new set of principles and the old set is much nicer.
I also agree there should probably be a consultation with the community on this.
...where’d the collaborative spirit go? The rest is mostly relabeling, so I’d let it slide, but that does seem like a glaring omission. Did EAs helping each other not poll well in a non-EA focus group or something?
I agree and argued in a similar direction in a comment last year.