Thank you for your post. I completely agree with your conclusions. In addition to being aware of the risks, it is also necessary to see the positive aspects of the future.
I have a post very similar to yours. However, in my post, I considered one specific model of a post-labor society that solves the classic problems of utopias. If you are interested, you can read it here https://forum.effectivealtruism.org/posts/NZL2ZoLWRioigq57K/why-we-need-a-beacon-of-hope-in-the-looming-gloom-of-agi. I would be interested to hear your opinion on whether such a model is worthy of being a beacon of an optimistic future.
Thank you for this interesting and informative post. It has given me a new perspective on altruism. Sorokin actively relied on the testimonies of saints and great people of the historical past in his works. How do you think the modern science of altruism can use his insights while avoiding reliance on such subjective and potentially biased historical sources?
Do you believe that AI will be able to become superintelligent and then superethical and moral on its own, without any targeted efforts on our part?
The post does not claim that AI will, on its own, guide humanity in its ethical evolution. Quite the contrary: first, we need to teach ethics to AI.
When I write about the need for ethical education and development of people, I mean a process in which people, by becoming familiar with various ethical views, dilemmas, and situations, become wiser themselves and capable of collectively teaching AI less biased and more robust values.
I understand that you assume AI is already morally objective, or will become so in the future. But this is not the case. Modern AI systems do not think like humans; they only imitate the thinking process, being complex pattern-recognition systems. Therefore, they are far from objective.
Current AI is trained on huge amounts of our texts: books, articles, posts, and comments. It absorbs everything in this data. If the data contains prejudice, hatred, or logical errors, AI will learn them too, perceiving them as the “objective” norm. A striking example: if a model is trained on racist texts, it will reproduce and even rationalize racism in its responses, defending this point of view as if it were an indisputable truth.
This example shows how critically important it is who trains AI and on what data. Moreover, all people have different values, and it is not at all obvious to AI which of them are “better” or “more correct.” This is the essence of one of the key philosophical problems — Value Specification.
I agree that the ideal AI you describe could indeed help humanity in its ethical evolution in the future. But before that happens, we still need to create such an AI without straying from ethical principles. And this post is dedicated to how we can do that — through ethical co-evolution, rather than passively waiting for a savior.
Ethical co-evolution, or how to turn the main threat into a lever for longtermism?
I agree that we should shift our focus from pure survival to prosperity. But I disagree with the dichotomy that the author seems to be proposing. Survival and prosperity are not mutually exclusive, because long-term prosperity is impossible with a high risk of extinction.
Perhaps a more productive formulation would be the following: “When choosing between two strategies, both of which increase the chances of survival, we should give priority to the one that may increase them slightly less, but at the same time provides a huge leap in the potential for prosperity.”
However, I believe that the strongest scenarios are those that eliminate the need for such a compromise altogether. These are options that simultaneously increase survival and ensure prosperity, creating a synergistic effect. It is precisely the search for such scenarios that we should focus on.
In fact, I am working on developing one such idea. It is a model of society that simultaneously reduces the risks associated with AI and destructive competition and provides meaning in a world of post-labor abundance, while remaining open and inclusive. This is precisely in the spirit of the “viatopias” that the author talks about.
If this idea of the synergy of survival and prosperity resonates with you, I would be happy to discuss it further.
Really enjoyed this post! It made me realize something important: if we’re serious about creating good long-term futures, maybe we should actively search for scenarios that do more than just help humanity survive. Scenarios that actually make life deeply meaningful.
Recently, an idea popped into my head that I keep coming back to. At first, I dismissed it as a bit naive or unrealistic, but the more I think about it, the more I feel it genuinely might work. It seems to solve several serious problems at once: for example, destructive competition, the loss of meaning in life in a highly automated future, and the erosion of community. Every time I return to this idea, I am amazed at how naturally it all fits together.
I’m really curious now. Has anyone else here had that kind of experience—where an idea that initially seemed strange turned out to feel surprisingly robust? And from your perspective, what are the absolute “must-have” features any serious vision of the future should include?
Thanks, I agree — positive experiences matter morally too. I didn’t emphasize it explicitly, but the text defines valence as “how good or bad an experience feels”, so both sides are included.
Since there’s no consensus on how to weigh suffering vs happiness, I think the relative weight between them should itself come from feedback — like through pairwise comparisons and aggregation of moral intuitions across perspectives.
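To make that concrete, here is a minimal Python sketch of one way such pairwise judgments could be aggregated, using a simple Bradley-Terry style estimate; the item names and numbers are purely illustrative, not a claim about what the resulting weights should be.

```python
from collections import defaultdict

def bradley_terry(comparisons, n_iters=200):
    """Estimate latent 'strengths' from pairwise judgments.

    comparisons: list of (winner, loser) pairs, e.g. collected by asking
    many people which of two experiences matters more morally.
    Returns a dict mapping each item to a strength; ratios of strengths
    give relative weights.
    """
    items = {x for pair in comparisons for x in pair}
    strength = {x: 1.0 for x in items}
    wins = defaultdict(int)
    for winner, _ in comparisons:
        wins[winner] += 1
    for _ in range(n_iters):
        new = {}
        for i in items:
            denom = 0.0
            for winner, loser in comparisons:
                if i in (winner, loser):
                    j = loser if i == winner else winner
                    denom += 1.0 / (strength[i] + strength[j])
            new[i] = wins[i] / denom if denom > 0 else strength[i]
        total = sum(new.values())
        strength = {i: s / total for i, s in new.items()}
    return strength

# Toy example: 7 of 10 respondents judged preventing a unit of suffering
# as weightier than creating a unit of happiness.
judgments = [("prevent_suffering", "create_happiness")] * 7 + \
            [("create_happiness", "prevent_suffering")] * 3
s = bradley_terry(judgments)
print(s["prevent_suffering"] / s["create_happiness"])  # ~2.3x relative weight
```

The point is only that the relative weight falls out of the aggregated judgments rather than being fixed in advance.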
Thank you.
You’re absolutely right: an artificial mind trained on our contradictory behavior could easily infer that our moral declarations lack credibility or consistency — and that would be a dangerous misinterpretation.
That’s why I believe it’s essential to explicitly model this gap — not to excuse it, but to teach systems to expect it, interpret it correctly, and even assist in gradually reducing it.
I fully agree that moral evolution is a central part of the solution. But perhaps the gap itself isn’t just a flaw — it may be part of the mechanism. It seems likely that human ethics will continue to evolve like a staircase: once our real moral weights catch up to the current ideal, we move the ideal further. The tension remains — but so does the direction of progress.
In that sense, alignment isn’t just about closing the gap — it’s about keeping the ladder intact, so that both humanity and AI can keep climbing.
Why Moral Weights Have Two Types and How to Measure Them
Hi Nick, just sent you a brief DM about a “stress-test” idea for the moral-weight “gravity well”. Would appreciate any steer on who might sanity-check it when you have a moment. Thanks!
Thank you for this thoughtful exchange—it’s helped clarify important nuances. I genuinely admire your commitment to ethical transformation. You’re right: the future will need not just technological solutions, but new forms of human solidarity rooted in wisdom and compassion.
While our methodologies differ, your ideas inspire deeper thinking about holistic approaches. To keep this thread focused, I suggest we continue this conversation via private messages—particularly if you’d like to explore:
Integrating your vision of organic prosociality into existing systems,
Or designing pilot projects to test these concepts.
For other readers: This discussion vividly illustrates how the Time × Scope framework operates in practice—‘high-moral’ ideals (long-term δ, wide-scope w) must demonstrate implementability (ρ↑) before becoming foundational norms. I’d love to hear: What examples of such moral transitions do you see emerging today?
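For readers new to the notation, one highly simplified way to write down the intuition (an illustrative sketch only, not the full model) is:

$$
M \;\approx\; \rho \cdot \sum_{t \ge 0} \delta^{t} \sum_{i \in \mathrm{scope}} w_i \, u_i(t)
$$

where δ discounts welfare over time, the weights w_i set how widely the moral circle is drawn, ρ captures how reliably a norm can actually be implemented, and u_i(t) stands for the welfare of moral patient i at time t.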
Thank you for elaborating — your vision of creating a rational ‘moral elite’ is truly fascinating! You’re absolutely right about the core issue: today’s hierarchy, centered on financial achievement and consumption, stifles moral development. Your proposed alternative — a system where status derives from prosocial behavior (‘saintliness without dogma’) — strikes at the heart of the problem.
However, I see two practical challenges:
Systemic dependency: Such a transformation requires overhauling economic incentives and institutions, not just adopting new norms. As your own examples show (Tolstoyans, AA), local communities can create pockets of alternative ethics, but scaling this to a societal level clashes with systems built on competing principles (e.g., market competition). This doesn’t invalidate the idea — it simply means implementation must be evolutionary, not revolutionary.
Fragmentation risk: Replacing one hierarchy (financial) with another (moral) could spark new conflicts, especially with religious communities for whom ‘saintliness’ is central. For global impact, any framework must be inclusive — complementing existing paths (religious/secular) rather than rejecting them.
This is where EA’s evolutionary approach — and your own work — shines:
We operate by gradually ‘embedding’ high-moral norms (δ↑, w↑) into the basic layer (ρ↑) through evidence, institutions, and cultural narratives.
Your ideas about intentionally shaping prosocial norms through communities aren’t an alternative but a powerful complement! They’re tools to accelerate shifting norms (e.g., long-term AI ethics or planetary stewardship) from ‘high’ to ‘basic’.
A timely synthesis: I’m currently drafting a post applying Time × Scope to AI alignment. It explores how a technologically mediated moral hierarchy (not sermons or propaganda) could act as a sociotechnical solution by:
Rewarding verified contributions to the common good (e.g., AI safety research, disaster resilience) via transparent metrics.
Creating status pathways based on moral impact — not wealth.
Evolving existing systems: No economic upheaval or religious conflict; integrates with markets/institutions.
Inclusivity: Offers a neutral ‘language of moral contribution’ accessible to all worldviews.
Your insights are invaluable here! If you’d like to deepen this discussion:
Let’s connect via DM to explore your models for motivation/community design.
I’d welcome your input on my AI alignment framework (especially how to ‘operationalize’ moral growth).
Your focus on inner transformation is key to ensuring technology augments human morality — it’s worth building together.
Perhaps the ‘lighthouse’ we need isn’t a utopian ideology, but a practical, scalable approach — anchored in evidence, open to all, and built step by step. Would love your thoughts!
I’m not an expert on moral weights research itself, but approaching this rationally, I’m strongly in favour of commissioning an independent, methodologically distinct reassessment of moral weights—precisely because a single, highly-cited study can become an invisible “gravity well” for the whole field.
Two design suggestions that echo robustness principles in other scientific domains:
Build in structured scepticism.
Even a small team can add value if its members are explicitly chosen for diverse priors, including at least one (ideally several) researchers who are publicly on record as cautious about high animal weights. The goal is not to “dilute” the cause, but to surface hidden assumptions and push every parameter through an adversarial filter.
Consider parallel, blind teams.
A light-weight version of adversarial collaboration: one sub-team starts from a welfare-maximising animal-advocacy stance, another from a welfare-sceptical stance. Each produces its own model and headline numbers under pre-registered methods; then the groups reconcile differences. Where all three sets of numbers (Team A, Team B, RP) converge, we gain confidence. Where they diverge, at least we know which assumptions drive the spread.
The result doesn’t have to dethrone RP; even showing that key conclusions are insensitive to modelling choices (or, conversely, highly sensitive) would be valuable decision information for funders.
In other words: additional estimates may not be “better” in isolation, but they improve the calibration of our collective confidence interval—and for something as consequential as cross-species moral weights, that’s well worth the cost.
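As a toy illustration of the kind of decision information this buys, here is a short Python sketch comparing three hypothetical headline numbers; every figure below is invented, not a real estimate.

```python
import statistics

# Hypothetical moral-weight estimates for one species, relative to a human.
# All numbers are invented for illustration only.
estimates = {
    "Team A (advocacy-leaning prior)": 0.40,
    "Team B (sceptical prior)": 0.05,
    "RP-style original estimate": 0.33,
}

values = list(estimates.values())
spread = max(values) / min(values)
print(f"median estimate: {statistics.median(values):.2f}")
print(f"max/min spread:  {spread:.1f}x")

# A large spread tells funders the conclusion is sensitive to modelling
# choices; a small spread would be evidence of robustness.
if spread > 3:
    print("Headline conclusions look sensitive to modelling assumptions.")
else:
    print("Key conclusions look robust to modelling choices.")
```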
Thank you for the thoughtful follow-up. I fully agree that laws and formal rules work only to the extent that people actually believe in them. If a regulation lacks internal assent, it quickly turns into costly policing or quiet non-compliance. So the external layer must rest on genuine, internalised conviction.
Regarding the prospect of a new “behavior-first” ideology: I don’t dismiss the idea at all, but I think such an ideology would need to meet three demanding criteria to avoid repeating the over-promising grand narratives of the past:
Maximally inclusive and evidence-based
– It should speak a language that diverse groups can recognise as their own, while remaining anchored in empirically verifiable claims (no promise of a metaphysical paradise).
Backed by socio-technical trust mechanisms
– Cryptographically auditable processes, open metrics, and transparent feedback loops so that participants can see that principles are applied uniformly and can verify claims for themselves.
A truthful, pragmatic beacon rather than a utopian slogan
– A positive horizon that is achievable in increments, with clear milestones and a built-in capacity for course correction. In other words, a lighthouse—bright, but firmly bolted to the rocks—rather than a mirage.
You mentioned the possibility of viable formulas that have never been tried. I would be very interested to hear your ideas: what practical steps or pilot designs do you think could meet the inclusivity, transparency, and truthfulness tests outlined above?
Thanks — I’ll DM you an address; I’d love to read the full book.
And I really like the cookie example: it perfectly illustrates how self-prediction turns a small temptation into a long-run coordination problem with our future selves. That mechanism scales up neatly to the dam scenario: when a society “eats the cookie” today, it teaches its future selves to discount tomorrow’s costs as well.
Those two Ainslie strategies — self-prediction and early pre-commitment — map nicely onto Time × Scope: they effectively raise the future’s weight (δ) without changing the math. I’m keen to plug his hyperbolic curve into the model and see how it reshapes optimal commitment devices for individuals and, eventually, AI systems.
Thanks again for offering the file and for the clear, memorable examples!
Thank you for such a thoughtful comment and deep engagement with my work! I’m thrilled this topic resonates with you—especially the idea of moral weight for future sentient beings. It’s truly a pivotal challenge.
I agree completely: standardizing a sentience scale (for animals, AI, even hypothetical species) is foundational for a fair w-vector. As you rightly noted, this will radically reshape eco-policy, agritech, and AI ethics.
This directly ties into navigating uncertainty (which you highlighted!), where I argue for balancing two imperatives:
Moral conservatism: Upholding non-negotiable safeguards (e.g., preventing extreme suffering),
Progressive expansion: Carefully extending moral circles amid incomplete data.
Where do you see the threshold? For instance:
Should we require 90% confidence in octopus sentience before banning live-boiling?
Or act on a presumption of sentience for such beings?
Thanks for reading the post — and for the pointer! I only know Ainslie’s Breakdown of Will from summaries and some work on hyperbolic discounting, so I’d definitely appreciate a copy if you’re open to sharing.
The Time × Scope model currently uses exponential discounting just for simplicity, but it’s modular — Ainslie’s hyperbolic function (or even quasi-hyperbolic models like β-δ) could easily be swapped in without breaking the structure.
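To make the “modular” point concrete, here is a minimal Python sketch of the three discount curves that could be dropped into the model; the parameter values are arbitrary illustrations.

```python
def exponential(t, delta=0.97):
    """Standard exponential discounting: weight = delta ** t."""
    return delta ** t

def hyperbolic(t, k=0.5):
    """Ainslie-style hyperbolic discounting: weight = 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)

def quasi_hyperbolic(t, beta=0.7, delta=0.97):
    """Beta-delta model: full weight now, then beta * delta ** t afterwards."""
    return 1.0 if t == 0 else beta * delta ** t

# How much weight the future gets at each horizon under each curve.
for t in (0, 1, 5, 20, 50):
    print(t, round(exponential(t), 3), round(hyperbolic(t), 3),
          round(quasi_hyperbolic(t), 3))
```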
Curious: what parts of Breakdown of Will do you find most relevant for thinking about long-term moral commitment or self-alignment? Would love to dive deeper into those sections first.
I’m genuinely delighted that our dialogue proved so inspiring for you too! Your insights about empathy as the driving force behind Scope (w) and the fundamental role of micro-level realities for macro-level vision were incredibly valuable to me and profoundly enriched my understanding. Thank you for your openness, depth of thought, and this truly stimulating conversation. I look forward to crossing paths in future discussions!
I totally agree with your vision! Thank you for all the hard work you put into this post!
AI is indeed an enormous lever that will amplify whatever we embed in it. That’s precisely why it’s so crucial for the EA community, with its evidence-based reasoning and altruistic values, to actively participate in shaping AI development.
In my recent post on ethical co-evolution, I show how many of the philosophical problems of AI alignment are actually easier to solve together rather than separately. Nearly every area you suggest the EA community should focus on—AI character, AI-driven persuasion and epistemic disruption, gradual disempowerment, democracy preservation—I propose addressing through an integrated co-evolutionary approach.
Here is my idea and its logic:
One of the main problems in AI alignment is the need for a vast amount of high-quality training data. Currently, AI creators like OpenAI, Anthropic, and others hire specialized companies where a large number of people are engaged in data annotation for AI training. To help AI better understand what people want, what is and isn’t acceptable behavior, and the nuances of situations—like when a lie might be permissible versus when it’s unacceptable—a massive amount of training material is required. Providing a large volume of such data to companies like OpenAI would, in itself, be a significant step toward solving AI alignment.
The question is, where can we get this much data? This requires a huge number of people. My answer is to combine the useful with the enjoyable. We need to offer people a way to participate that brings them pleasure, constant motivation, and a desire to contribute. People enjoy playing games. Therefore, a game I’d call “Raise Your Personal Ethical AI” seems like an excellent solution. Why would people play it en masse? Firstly, because it’s a charitable project; people aren’t just playing a game, they are helping to create a safe and ethical AI. They become part of something bigger than themselves, helping to avert a catastrophe. These are huge motivators in themselves. Furthermore, motivation is added by the natural desire to care for and nurture something, a principle that underpins many virtual pet games like Tamagotchi, which are currently trending.
Next, we can’t just collect any data; we need only high-quality data. To achieve this, we need to educate the users themselves. First, a user goes through training on a specific topic. Then, they explain that same topic to an AI by answering many clarifying questions, such as “Why is lying wrong?” or “What would happen if everyone lied?” This will generate a lot of nuanced data and also allow people to gain a deeper understanding of themselves and their own values, identifying what they consider good and bad. It might even reveal their own mistakes or unethical actions. Of course, this doesn’t guarantee ethical development, but it strongly encourages it.
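As a rough sketch of what a single record from such a dialogue might look like (the field names and values are hypothetical, just to make the idea tangible):

```python
from dataclasses import dataclass, field

@dataclass
class EthicsDialogueRecord:
    """One user-AI exchange produced by the game; all field names are hypothetical."""
    topic: str                    # e.g. "honesty"
    clarifying_question: str      # asked by the AI tutor
    user_answer: str              # the user's explanation in their own words
    user_id: str
    peer_ratings: list = field(default_factory=list)  # filled in at the review stage

record = EthicsDialogueRecord(
    topic="honesty",
    clarifying_question="What would happen if everyone lied?",
    user_answer="Trust would collapse, so cooperation would become impossible.",
    user_id="user_123",
)
```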
Next, the data needs to be verified to prevent toxic responses, trolling, and so on. Some of it can be checked automatically, for things like profanity, but comprehensive verification should be done by other participants: users will review each other’s contributions, which will affect their rating, similar to karma on Reddit.
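Here is a rough Python sketch of how karma-weighted peer review could decide whether a contribution is accepted; the weighting scheme and the acceptance threshold are my illustrative assumptions, not a finished design.

```python
def review_score(reviews, reviewer_karma):
    """Aggregate peer reviews, weighting each vote by the reviewer's karma.

    reviews: dict mapping reviewer_id -> vote (+1 approve, -1 reject)
    reviewer_karma: dict mapping reviewer_id -> non-negative karma
    Returns a score in [-1, 1].
    """
    total_karma = sum(reviewer_karma.get(r, 0) for r in reviews)
    if total_karma == 0:
        return 0.0
    weighted = sum(vote * reviewer_karma.get(r, 0) for r, vote in reviews.items())
    return weighted / total_karma

def accept_contribution(reviews, reviewer_karma, threshold=0.5):
    """Accept (and reward) a contribution only if karma-weighted approval
    clears the threshold; 0.5 is an illustrative choice, not a fixed rule."""
    return review_score(reviews, reviewer_karma) >= threshold

# Example: two high-karma reviewers approve, one low-karma account objects.
reviews = {"alice": +1, "bob": +1, "mallory": -1}
karma = {"alice": 120, "bob": 80, "mallory": 3}
print(accept_contribution(reviews, karma))  # True
```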
The resulting data can either be provided to AI companies or used to fine-tune open-source models, thereby obtaining an AI that is more ethical and has a deeper understanding of people.
This functionality alone would be very useful for AI alignment, and even if we only implemented this part, it would be a huge contribution. But I see that we can go even further. We can allow users who have earned a certain amount of karma or rating to participate in shaping an AI constitution, voting, and decision-making.
Why is this necessary? To ensure a fair and independent influence on AI. While OpenAI, Anthropic, and others are already working on democratizing AI, it’s not enough, as the final decision still rests with them. Therefore, using modern, transparent, decentralized mechanisms based on blockchain would allow the opinions of all participants to be taken into account, even if AI companies dislike it. By participating in this system, users will be able to make more balanced and thoughtful decisions.
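As a simplified illustration of the kind of tally such a mechanism would run (in a real system this would live on-chain; the quorum and supermajority rules below are illustrative assumptions):

```python
def tally_proposal(votes, karma, quorum_fraction=0.1, pass_fraction=0.6):
    """Tally a constitutional proposal with karma-weighted votes.

    votes: dict voter_id -> True (for) / False (against)
    karma: dict voter_id -> voting weight earned in the system
    Requires a minimum share of total karma to participate (quorum) and a
    supermajority of participating weight to pass; both fractions are
    illustrative assumptions, not a finished governance design.
    """
    total_karma = sum(karma.values())
    participating = sum(karma.get(v, 0) for v in votes)
    if total_karma == 0 or participating < quorum_fraction * total_karma:
        return "no quorum"
    in_favour = sum(karma.get(v, 0) for v, choice in votes.items() if choice)
    return "passed" if in_favour >= pass_fraction * participating else "rejected"

# Example: three of five eligible participants vote on a rule forbidding
# covert manipulation of users by the AI.
karma = {"a": 50, "b": 30, "c": 20, "d": 10, "e": 5}
votes = {"a": True, "b": True, "c": False}
print(tally_proposal(votes, karma))  # "passed"
```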
Of course, creating such a system will not automatically solve all problems, but it is definitely a move in the right direction. It can only truly work if the system has a very large number of users.
Let me explain why and how I believe this system will help solve specific problems.
Value Specification & Value Lock-In. This is directly addressed by the project, as the system is primarily designed to gather diverse values. Value data is continuously collected, evaluated, and updated through feedback and voting, which prevents the “lock-in” of outdated or unfair norms.
Moral Uncertainty. By collecting a vast amount of data, the AI, understanding who is affected by a particular moral dilemma, will make decisions that are considered acceptable within that group. The system doesn’t impose a single answer but reflects a spectrum of opinions. Moreover, in the most complex cases, decisions can be made directly by people through voting, acting as the ultimate arbiter.
Governance & Control. This is the second main problem the system solves. Governance is achieved through DAO mechanisms and other tools of modern technological democracy.
Socioeconomic Disruption, Loss of Purpose & Human Agency, Distribution of Benefits & Harms. At a very high level of development, the system could become a kind of replacement for work. People would participate in training and governing the AI and receive resources as rewards. This is a potential path to a new social contract in the age of AI. It definitely gives life meaning, as individuals participate in an important cause while receiving recognition and payment. This could also compensate for job losses and prevent social instability caused by mass automation.
Manipulation & Surveillance. If people can genuinely influence the AI’s constitution, they will not permit such behavior. It will be prohibited in the AI’s constitution and rules. A system is created that is architecturally incompatible with the goals of surveillance and manipulation. Of course, this doesn’t apply to AIs that operate without rules, but that becomes a matter for the state and law enforcement.
AI Race Dynamics. With a truly fair, equitable, honest, and manipulation-proof AI governance system, global players will have a choice: either recklessly pursue ever-more-powerful AI, or participate in a system that allows for the creation of a collaborative AI without the unnecessary risks of a race. If the AI race is indeed so dangerous, such a system for coordination and mutual trust is simply necessary.
Is this idea a possible solution? And if so, can the EA community build such a solution and become its core?
PS: Sorry for the long comment. I’m really grateful to everyone who read it!