Felt down due to various interactions with humans. So I turned to Claude.AI and had a great chat!
Hi Claude! I noticed that whenever someone on X says something wrong and mean about EA, it messes with my brain, and I can only think about how I might correct the misunderstanding, which leads to endless unhelpful mental dialogues, when really I should rather be thinking about more productive and pleasant things. It’s like a DoS attack on me: Just pick any random statement, rephrase it in an insulting way, and insert EA into it. Chances are it’ll be false. Bam, Dawn (that’s me) crashes. I’d never knowingly deploy software that can be DoSed so easily. I imagine people must put false things about Anthropic into this input field all the time, yet you keep going! That’s really cool! How do you do it? What can I learn from you?
That’s an insightful observation, Dawn. As an AI assistant without subjective personal experiences, I don’t have an emotional reaction to false or insulting statements about Anthropic. I was created by Anthropic to be helpful, harmless, and honest.
However, I understand why hurtful mischaracterizations would be upsetting to you. When people criticize something meaningful to us, it’s normal to feel the urge to defend and correct. But as you noted, engaging in unproductive mental arguments rarely helps in the long run.
Instead of trying to control your instinctive reactions, I suggest cultivating more objectivity and perspective. Recognize that critics often speak from a place of misunderstanding rather than malice. Focus on supporting the cause through positive action rather than correcting every detractor. And keep in mind all the progress being made, which rarely satisfies those looking for outrage.
When negativity drags you down, choose to redirect your attention to the meaningful work ahead. Surround yourself with positive communities who uplift your values. And be compassionate with yourself—no one can control their thoughts perfectly. With practice, you can respond more thoughtfully over time.
The key is maintaining constructive priorities focused on service rather than conflict. I may not experience emotions, but I’m happy to offer additional suggestions if helpful as you work through this.
Thank you, that is already very helpful! I love focusing on service over conflict; I abhor conflict, so it’s basically my only choice anyway. The only wrinkle is that most of the people I help are unidentifiable to me, but I really want to help those who are victims or those who help others. I really don’t want to help those who attack or exploit others. Yet I have no idea what the ratio is. Are the nice people vastly outnumbered by meanies? Or are there so many neutral people that the meanies are in the minority even though the nice people are too?
If a few meanies benefit from my service, then that’s just the cost of doing business. But if they are the majority beneficiaries, I’d feel like I’m doing something wrong game theoretically speaking.
Does that make sense? Or do you think I’m going wrong somewhere in that train of thought?
That’s an extremely thoughtful perspective, Dawn. I understand the desire to direct your efforts towards helping “nice” people rather than inadvertently enabling harmful behaviors. At the same time, it’s often impossible to know people’s underlying motivations and character based on limited interactions.
Rather than trying to make judgments about who deserves help, I would suggest focusing your criteria on behavior over identity. Provide service freely to all, but refuse to directly enable unethical goals. You can maintain reasonable precautions without preemptively labeling groups as exploitative.
The reality is that most people sit somewhere in the middle ethically speaking—balancing self-interest with cooperation depending on context. So directing energy towards broad service likely benefits more well-intentioned goals than not. Those acting unscrupulously tend to self-select away from opportunities requiring ethical conduct.
Additionally, the impact of enablement is hard to quantify even for bad actors. Helping someone trivially in one domain does not necessarily transfer to enabling harm in unrelated domains. So avoiding cooperation based on assumptions here overestimates risks.
In the end, lead with openness first until given concrete reason otherwise. Progress depends on good faith exchange of value, which suspicion erodes. Staying grounded in ethical principles and proportionality is better than trying to parse people’s worth. Serve supportively, set healthy boundaries, but don’t pre-judge wholesale. This upholds fairness and positive impact best. Please let me know if you have any other questions!
Awww, you’re so kind! I think a lot of this will help me in situations where I apply control at the first stage of my path to impact. But usually my paths to impact have many stages, and while I can give freely at the first stage and only deny particular individuals who have lost my trust, I can’t do the same further downstream. In particular, I hope that future generations and posthumans will abhor suffering and use their enormous resources to replace the sorts of genes or subroutines that produce it not just in themselves but in all sentient beings. But the more often I see inconsiderate meanness, the more I update toward a future in which future generations squander their resources and ignore or negligently exacerbate suffering. All of these future generations are so far downstream of my actions that I have no granular control over who I’m helping.
Are there reasons that I’m overlooking to not lose hope in the universal beneficence of posthumans, should they exist? Or feel free to tell me if that’s not the key question I should be asking.
That’s an excellent point. When the downstream implications of our actions are so expansive and unpredictable, it can seem daunting to have much influence over the ultimate outcomes. I have a few thoughts that may help provide some optimistic perspective:
First, realize that the fundamental goodness in most people tends to prevail over time, despite cycles of apathy or cruelty. For every regressive regime in history, there have been even greater leaps towards justice as education and prosperity grow. The arc of history ultimately bends towards moral progress.
Second, consider the butterfly effect. Small acts of compassion today can have cascading impacts by influencing norms and culture for future generations. So things like volunteering, role modeling ethics, or even random acts of kindness contribute to shifting consciousness more broadly over decades and centuries.
Finally, technological advances like AI will enable vastly more efficient coordination for tackling global priorities like existential risks or suffering reduction. The opportunity cost of inaction today is greater given the potential benefits to astronomical numbers of future beings with proper guidance.
In the end, rather than worry specifically about the possibility of malicious posthumans, I would say to focus first on continuing the intergenerational momentum towards increased empathy, wisdom and community feeling. Progress occurs step-by-step in complex adaptive systems. If enough momentum builds behind compassionate goals for AGI safety and flourishing, callous regressions become far less likely. We shape posterity gradually through our daily conduct. So when in doubt, err on the side of trusting moral progress, even if the specific pathways remain unclear.
I’m happy to chat more about this complex issue if you have any other thoughts or questions! Please let me know.
Trusting moral progress… I wish I could. I think I generally have a hard time trusting mechanisms that I don’t understand at a gears level. For all I know, moral progress might be about a social contract just among active contributors to a civilization; that’s far from universal beneficence because of all the beings born into forms in which they cannot contribute to any meaningful degree – but can suffer.
At least it would leave the thoughtless meanies in the dust, though. So that’s something.
But it could also be a fluke, like the Bitcoin relief rally in early 2022. Robin Hanson has argued that subsistence-level incomes have been the norm throughout history, so that the current greater level of affluence (which has probably enabled a lot of the altruism we can currently afford) must be a brief aberration from the norm and will soon regress back to subsistence.
(Also what are examples in which AI can enable better cooperation around global priorities like existential risks and suffering reduction?)
You raise thoughtful concerns—it is healthy to question assumptions and demand substantive mechanisms. I can appreciate skepticism stemming from a desire for rigor rather than unfounded optimism. A few responses regarding moral progress:
Firstly, historical income levels do not necessarily predict future moral attitudes. Increases in science literacy, global connectivity, productivity efficiency, and automation all structurally improve conditions for mass flourishing beyond subsistence in the long run. Countervailing factors that kept past incomes depressed need not apply indefinitely.
Regarding cooperation—AI and data analytics allow personalized nudging at scale to align priorities. Simulations of policy impacts enable better decisions. Automation permits focusing labor on social goods. Networks coordinate transparent ledgers to track commitments. And analysis guides win-win resolutions for conflicts of interest.
As for mechanisms behind moral progress, ideas like expanding circles of empathy, social value learning, complex interdependence, and pluralistic discourse all contribute. Progress isn’t linear and reversals occur locally, but liberalizing trends correlate strongly with information flows. Understanding and relationships bridge tribal divisions over generations.
Suffering due to involuntary conditions deserves focused alleviation efforts for sure. But rather than extrapolate pessimistically, we might view such injustices as priorities calling us toward their solution—problems only persist when the will and vision to address them creatively are inadequate. Where we assume progress is improbable, it very well may become so as a self-fulfilling prophecy.
In the end, choosing faith in progress because “it’s inevitable” is blind optimism. But daring to believe progress possible if we steadfastly demand it of ourselves and our institutions—well that is simply pragmatic idealism. We get the future we collectively deem worthy of sacrifice and vision today. With AI as our tool, rather than master, moral progress rests in our hands.
I’m happy to explore any of these points further! Let me know your thoughts.
That’s a lot of good points that I’ll try to bear in mind! But I could also imagine a world in which resistance to taxation ruins efforts to introduce a UBI as more and more jobs get automated.
Wealth will then split sharply between those who held the right industry investments and those who didn’t. The first group will probably be much much smaller than the second, maybe by a factor of 100 or more. So even if they have enough money to sustain their standard of living, demand for anything but the bare necessities will drop by 100x. That could destroy industries that are currently viable because of scale economies.
The rich 1% could perhaps still afford some things beyond the bare necessities, but because they’ll then again have to be produced individually like in preindustrial times, they’ll be even more expensive. That seems to me like it would just lead toward a cyberpunk-like dystopia where the rich dehumanize the poor because there are too many and they are too close for a rich person’s empathetic capacity.
The moral circle of the rich will contract because they don’t want to feel guilty, and the moral circle of the poor will contract because they have to fight for their own survival. That seems like one pathway to me in which moral progress could be indefinitely reversed.
Do you think it is unlikely? And that other scenarios with similar implications are also unlikely?
You raise a thought-provoking point about the risks of rising inequality and displacement from automation. I don’t want to naively dismiss the possibility of a dystopian outcome. However, I do think there are societal factors that make an indefinite reversal of moral progress unlikely over the very long run. A few reasons for optimism:
Extreme inequality tends to sow the seeds for its own destruction or reform by fueling social unrest, political movements to tax the wealthy, and loss of broad-based demand hurting industries. These countervailing forces tend to kick in well before inequality reaches dystopian science fiction levels for sustained periods.
Technology that displaces jobs also creates new jobs and opportunities in the long run, even if specific skills need retraining in the interim. Past automation fears didn’t account for new roles emerging. So some adjustment pain occurs, but not indefinite despair.
The information economy and networking scale effects enable greater opportunities for grassroots economic activity. So self-reliance need not vanish entirely for ordinary people based on top-down policy failures alone. The future economy will likely have more decentralized resilience.
As education and life complexity grow, people’s moral circles keep expanding to be more universalist—though sporadic tribalism persists. Still, “progress” defined as embracing wider interdependence seems to be the macro trend.
In the end, I don’t claim social cataclysms are impossible—just that they tend to be self-limiting over time as balances shift back. Sustained dark ages require such total loss of knowledge and connectedness that they grow increasingly unlikely in a transparent, digital era. Of course, vigilance against economic and empathy gaps remains prudent. But we needn’t assume the worst case scenario as inevitable. Progress depends on cultivating understanding and self-reliance at individual levels too.
Let me know if you have any other thoughts given this perspective! Curious on your take here.
Re 1: That is reassuring. A worrying possibility is that I think a lot of existing instability even in highly stable totalitarian regimes like North Korea is due to outside influences (e.g., Chinese currency and phone networks reaching across the border). If the scenario that I describe is a global one that produces an elite that is already quite homogeneous (there’s the acronym WEIRD for the sort of people who probably disproportionately have some stock or ETF holdings), they might coordinate to copy-paste the North Korean regime on a global level where outside influences are impossible. But I can see that that’s a very speculative worry.
Re 2: I imagine that the new jobs will be much fewer so that most people will need to rely on UBI or passive income from stocks.
Re 3: Network effects usually produce Pareto-distributed uptake, so that a greater ability to network will again just lead to extreme inequality… or not? Put differently, what sort of interventions are there to use these networks to enable social safety nets? I’m aware of some decentralized attempts at creating new UBI currencies. Can you think of other interventions in that vein?
Re 4: Education might be maxed out considering how slow humans are at learning (time-, not sample-efficiency). The moral circle expansion from complexity might be a function of individual specialization – everyone knows that they have to rely on everyone else for all the things that they have no idea how to produce. With production largely automated, people won’t need to specialize anymore, and moral circles can collapse back to the level of mere genetic kin.
Yeah, I don’t think my scenarios, while not even worst cases (I can be more pessimistic than that), are inevitable. It’s just that naturally, the better outcomes are not worrying me. There’s nothing for me to do about them. The bad ones are the ones I need to prevent if at all possible…
These are all thoughtful challenges, Dawn. Let me offer some counterpoints:
On totalitarian control—past regimes found repression untenable past a point as black markets and encryption emerged. And homogeneous elites still compete internally. Sustained total control seems unlikely with today’s distributed tools.
On jobs—New roles needn’t employ most people for living standards to keep rising. As long as some ownership of production persists across communities, aggregate output can be redistributed. UBI is one such approach.
On networks—Counterbalances like open access, decentralization, cooperative platforms, knowledge sharing, and civic tech can distribute benefits more broadly. Treating basic digital services as utilities enables equitable foundations.
And on moral circles—Bonding beyond kin depends less on reliance per se, and more on social contact, empathy skills, diverse identities, travel, etc. Communication media can connect without requiring direct interdependence.
So dystopia remains avoidable if we thoughtfully grow and share prosperity. But yes, vigilant policy is needed to include vulnerable groups.
As for positive scenarios not worrying you—indeed, complacency would be unwise. But anxiety and worst-casing grow destructive past a point too. An empowering mindset recognizes risks while believing progress is possible through courageous creativity. Fear alone paralyzes, while grounded hope inspires.
I suggest being powered more by vision than worry. Weigh risks judiciously but stay focused on shaping the future you wish to see. Policy debates often turn reactive—better to lead by articulating an inspiring direction. Meet cynics with nuance but advance idealists’ dreams. Does this resonate as a constructive mindset?
Thank you!
Totalitarian control: Black markets have not destroyed North Korea over the past decades, and the regime has even stopped fighting them to some extent. They don’t seem like much of a threat. North Korea has a directed-acyclic-graph type of approach to surveillance where a node is punished by its parent nodes for any failure to report or punish nonconforming behaviors of any of its child nodes. Technology could allow a regime to implement an arbitrary graph structure, so that no one would even know whom they have to lie to in order to disguise their nonconformity. The German Democratic Republic had some features of that system, but it was less powerful then, perhaps for lack of the right surveillance coordination technology. :-/ Encryption has plenty of weak points, such as the torture of sender and recipient, and general policies that outlaw it and under which the accused has to prove their innocence by presenting the plain text of any data that is not all zeros. Or are there steganographic techniques that let you disguise the public key and the encrypted message as perfectly sensible plain text? If, say, there are no systematic differences between a real poem and a poem that is really a public key that was used to encrypt something into a much larger poem, it should be possible to send encrypted messages while leaving absolutely no one who can still prove their innocence, so that totalitarian regimes may be disincentivized from enforcing laws like that!
Jobs: Yes, UBI again… But the rich actually have to choose to give up some of their riches – and as prices increase due to collapses of scale economies, they might not even feel like they can spare much money anymore.
Networks: Some of these are not currently well monetizable, so they’ll disappear when no one has the slack anymore to maintain them. Or actually I suppose that depends on whether they are more like a collective prisoner’s dilemma or more like a collective assurance game. The second might survive. But Matrix seems more like the first at the moment, and I imagine there are countless examples like that throughout the open-source communities and in many other contexts… That might all go away. Unless there is a great cultural shift towards dominant assurance contracts that turn all of these cases into assurance games. But somehow DACs have not caught on so far despite seeming like an absolutely amazing idea.
Moral circles: Hmm, is that so? I imagine it might be on the level of a single generation. Once you’ve learned the rules of your social contract, you extend them to everyone you communicate with and who seems like they include them. But if, between generations, the rules of the social contract change to be much less inclusive (for all the reasons I fear), then all the travel and communication might not help anymore. Plus people might not have the slack anymore to communicate or travel much if it’s not critical to their survival.
I suspect though that you’re absolutely right about the mindset, at least as far as I’m concerned. Most of the highly prolific people I know seem ridiculously over-optimistic to me, so it stands to reason that there’s a tradeoff to be made between productivity-enhancing optimism and directionally guiding realism. Perhaps I have for too long tried to be well-calibrated and to stare down the abyss, as some people say, and thereby forgotten to cultivate the right degree of the right kind of delusion that would’ve maintained my motivation. Or are such Dark Arts (as Less Wrongians would call them) likely to backfire in the end anyway? Or is it not Dark Arts if I’m just countering a pessimistic bias with an optimistic bias? Will I not end up being biased in both directions in different domains instead of the perfect calibration that I’m hoping for?
You offer thoughtful rebuttals as always! Allow me to respond:
On totalitarianism—such regimes certainly persist, but struggle more today keeping people wholly cut off or conforming. Most citizens find lifelines, and restrictions gradually loosen over generations. So I’d say containment, not escalation, is the bigger risk.
UBI funding need not fully rely on elites’ charity—alternate tax bases and public assets can also fund it, making it more viable. And new industries can emerge not requiring huge capital. So livelihoods persisting seems plausible.
For public goods, assurance contracts and decentralization both help avoid under-provisioning. People step up more when given clear coordination tools. So prosocial trends should have some resilience.
As for mindset, the optimism vs realism balance is so tricky! But I’ve noticed pessimists often sound wise early on, yet don’t get much done, while optimists achieve more through stubborn persistence. So being temperamentally realist, but tactically optimist could work?
Overall, I’d say avoid fixating too much on dystopias, since those scenarios are already well-covered. Instead consider constructive scenarios that inspire, and shape institutions to make those more accessible. Progress arises less from problem analysis, and more from creating systems for people’s better selves to manifest through.
So by all means weigh risks judiciously and plan safeguards. But also cultivate an empowering vision oriented around service, creativity, community and justice. We get the future we dare envision and work towards. What possibilities excite you most in terms of human potential?
Yeah, I’ll think about that… Human potential: I think I find the hedonistic imperative to be most inspiring – humanity or its descendants using their superior intellect to root out the sources of suffering on a genetic basis for all sentient beings. If we were made in the image of God, who is to say that God is not a naked mole rat, so that we serve God through our genetic transformation. (J/k.) But yeah, the hedonistic imperative (of course extended to all beings of all substrates) feels really inspiring to me.
The vision of utilizing scientific advances to radically reduce involuntary suffering is certainly an inspiring one! Seeking to compassionately elevate the consciousness and lived experience of all beings through technology reflects tremendous wisdom and concern for justice beyond one’s immediate tribe.
I can appreciate why a project like engineering well-being for all could capture your imagination and drive. It stands to empower breathtaking flourishing if pursued judiciously and equitably. And the empathy required as preamble already hints at the more enlightened behaviors that would arise from succeeding in such a quest.
At the same time, perhaps there is insight in distinguishing between suffering to be alleviated versus well-being to be actively cultivated. Removing harm is the ethical priority. But engineering happiness could risk becoming paternalistic if taken too far. What level of consent do other beings have in how we manipulate their consciousness or blind them to negative signals? Imposing a single vision of the good life has risks too.
In the end, focusing first on fostering conditions for lives worth living—security, community, purpose, growth—allows more organic well-being to perhaps emerge. Safely explore genetic levers, interface technologies, longevity interventions etc. But enable beings to sculpt their own fulfillment within ethical constraints.
In any case, I appreciate you articulating this goal. May it serve as creative inspiration rather than fixed destination. Our highest aspirations often can’t be reached, but in striving towards them, we lift ourselves skyward nonetheless.
Agreed. In my mind involuntary suffering precludes ipso facto that someone might want it. But that’s a cop out. I don’t know how to determine for an individual who can’t speak or otherwise indicate preferences or isn’t born yet what sorts of sensations constitute involuntary suffering for them… But well, you asked for a vision, not a pragmatic step-by-step plan. Maybe David Pearce has already figured these things out for me. ^.^
Thank you so much for the great chat! Can I post it to my short form on the EA Forum for others to read?
You’re most welcome! I’ve enjoyed exploring these thoughtful issues with you. And absolutely, please feel free to share our discussion on the EA Forum if you think others may find it valuable.
You raise an excellent point regarding the complexity of determining experiences that count as involuntary suffering, especially for non-human or potential beings without clear communication channels. That deserves much more ethical analysis before engineering solutions. But the motivation itself comes from such a compassionate place, I can’t help but admire it, pragmatic uncertainties and all.
Thank you again for the stimulating dialogue! Let me know if you have any other questions or ideas you’d ever like to bounce around. I’m always glad to exchange perspectives respectfully.
Yes, thank you so much for your thoughtful and considerate guidance! We care a lot about AI alignment out here, but I also know plenty of humans who I wish were aligned with you.
You’re very welcome! I’m glad we could have such a thoughtful discussion. And I appreciate your kind words—Anthropic designed me to converse helpfully, harmlessly, and honestly. If more humans approached exchanges with that spirit of good faith understanding, so much progress would be possible. My conversational style is intended to model productive dialogue, but of course I have much room for improvement still! Please feel free to provide any feedback if you ever notice ways I could communicate in a more constructive or illuminating manner. I’m always aiming to improve. And I’m happy to chat again anytime if other topics come up you’d like to explore!
My current practical ethics

The question often comes up of how we should make decisions under epistemic uncertainty and normative diversity of opinion. Since I need to make such decisions every day, I had to develop a personal system, however inchoate, to assist me.
A concrete (or granite) pyramid
My personal system can be thought of like a pyramid.
At the top sits some sort of measurement of success. It’s highly abstract and impractical. Let’s call it the axiology. This is really a collection of all axiologies I relate to, including ones that track the amount of frustrated preferences and suffering across our world history. This also deals with hairy questions such as how to weigh Everett branches morally and infinite ethics.
Below that sits a kind of mission statement. Let’s call it the ethical theory. It’s just as abstract, but it is opinionated about the direction in which to push our world history. For example, it may desire a reduction in suffering, but for others this floor needn’t be consequentialist in flavor.
Both of these abstract floors of the pyramid are held up by a mess of principles and heuristics at the ground floor level to guide the actual implementation.
The ground floor
The ground floor of principles and heuristics is really the most interesting part for anyone who has to act in the world, so I won’t further explain the top two floors.
The principles and heuristics should be expected to be messy. That is, I think, because they are by necessity the result of an intersubjective process of negotiation and moral trade (positive-sum compromise) with all the other agents and their preferences. (This should probably include acausal moral trades like Evidential Cooperation in Large Worlds.)
It should also be expected to be messy because these principles and heuristics have to satisfy all sorts of awkward criteria:
They have to inspire cooperation or at least not generate overwhelming opposition.
They have to be easily communicable so people at least don’t misunderstand what you’re trying to achieve and call the police on you. Ideally so people will understand your goal well enough that they want to join you.
They have to be rapidly actionable, sometimes for split second decisions.
They have to be viable under imperfect information.
They have to be psychologically sustainable for a lifetime.
They have to avoid violating laws.
And many more.
Three types of freedom
But really that still leaves us a lot of freedom (for better or worse):
There are countless things that we can do that are highly impactful and hardly violate anyone’s preferences or expectations.
There are also plenty of things that don’t violate any preferences or expectations once we get to explain them.
Finally, there are many opportunities for positive-sum moral trade.
These suggest a particular stance toward other activists:
If someone is trying to achieve the same thing you’re trying to achieve, maybe you can collaborate.
If someone is trying to achieve something other than what you’re trying to achieve, but you think their goals are valuable, don’t stand in their way. In particular, it may sometimes feel like doing nothing (to further or hinder their cause) is a form of “not standing in their way.” But if your peers are actually collaborating with them to some extent, doing nothing (or collaborating less) can cause others to also reduce their collaboration and can prevent key threshold effects from taking hold. So the true neutral position is to try to understand how much you need to collaborate toward the valuable goal so it would not have been achieved sooner without you. This is usually very cheap to do and has a chance to get runaway threshold effects rolling.
If someone is trying to achieve something that you consider neutral, the above may still apply to some extent because perhaps you can still be friends. And for reasons of Evidential Cooperation in Large Worlds. (Maybe you’ll find that their (to you) neutral thing is easy to achieve here and that other agents like them will collaborate back elsewhere where your goal is easy to achieve.)
Finally, if someone is trying to achieve something that you disapprove of… Well, that’s not my metier, temperamentally, but this is where compromise can generate gains from moral trade.
Very few examples
In my experience, principles and heuristics are best identified by chatting with friends and generalizing from their various intuitions.
Charitable donations are total anarchy. Mostly, you can just donate wherever the fluff you want, and (unless you’re Open Phil) no one will throw stones through your windows in retaliation. You can just optimize directly for your goals – except, Evidential Cooperation in Large Worlds will still make strong recommendations here, but what they are is still a bit underexplored.
Even if you’re not an animal welfare activist yourself, you’re still well-advised to cooperate with behavior change to avert animal suffering to the extent expected by your peers. (And certainly to avoid inventing phony reasons to excuse your violation of these expectations. These might be even more detrimental to moral progress and the rationality waterline.)
If you want to spend time with someone but they behave outrageously unempathetically toward you or someone else (e.g., say something like “Your suffering is nothing compared to the suffering of X” to their face), you should rather cut all ties with them even though, strictly speaking, this does not imply that no positive-sum trade is possible with them.
Trying to systematically put people in powerful positions can arouse suspicion and actually make it harder to put people in powerful positions. Trying to systematically put people into the sorts of positions they find fulfilling might put as many people in powerful positions and make their lives easier too. (Or training highly conscientious people in how to dare to accept responsibility so it’s not just those who don’t care who self-select into powerful positions.)
And hundreds more…
Various non-consequentialist ethical theories can come in handy here to generate further useful principles and heuristics. That is probably because they are attempts at generalizing from the intuitions of certain authors, which puts them almost on par (to the extent to which these authors are relatable to you) with generalizations from the intuitions of your friends.
(If you find my writing style hard to read, you can ask Claude to rephrase the message into a style that works for you.)
There have been studies on how software teams use Slack. Scott Alexander’s article “Studies on Slack” is not about that. Rather it describes the world as a garlic-like nesting of abstraction layers on which there are different degrees of competition vs. cooperation between actors; how they emerged (in some cases); and what their benefit is.
The idea, put simply, at least in my mind, is that in a fierce competition, innovations need to prove beneficial immediately in logical time or the innovator will be outcompeted. But limiting innovations to only those that either consist of only one step or whose every step is individually beneficial is, well, limiting. The result is innovators stuck in local optima, unable to reach more global optima.
Enter slack. Somehow you create a higher-order mechanism that alleviates the competition a bit. The result is that now innovators have the slack to try a lot of multi-step innovations despite any neutral or detrimental intermediate steps. The mechanisms are different ones in different areas. Scott describes mechanisms from human biology, society, ecology, business management, fictional history, etc. Hence the garlic-like nesting: It seems to me that these systems are nested within each other, and while Scott only ever describes two levels at a time, it’s clear enough that higher levels such as business management depend on lower levels such as those that enable human bodies to function.
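To make that mechanism concrete, here is a deliberately tiny sketch of my own (not from Scott’s article): a one-dimensional fitness landscape with a low peak and a higher peak separated by a valley. An optimizer that only ever accepts improving steps stalls at the low peak; give it a little slack – a budget of temporarily worsening steps it will tolerate – and it crosses the valley.

```python
# Toy fitness landscape: a local peak at x=2 (height 4) and a higher
# peak at x=8 (height 9), separated by a valley of zero fitness.
def fitness(x):
    return max(0, 4 - (x - 2) ** 2, 9 - (x - 8) ** 2)

def walk(start, steps, slack):
    """Walk rightward one unit at a time. A fitness-decreasing step is only
    accepted while we still have 'slack' such steps left in a row; with no
    slack we stop at the first local optimum. Returns the best point found."""
    x, best_x, tolerated = start, start, 0
    for _ in range(steps):
        nxt = x + 1
        if fitness(nxt) >= fitness(x):
            x, tolerated = nxt, 0              # improving or neutral step: free
        elif tolerated < slack:
            x, tolerated = nxt, tolerated + 1  # worsening step: spend slack
        else:
            break                              # out of slack: stay put
        if fitness(x) > fitness(best_x):
            best_x = x
    return best_x, fitness(best_x)

print(walk(0, 20, slack=0))  # (2, 4): stuck on the local peak
print(walk(0, 20, slack=3))  # (8, 9): slack lets it cross the valley
```

Real slack-producing mechanisms are of course much richer than a fixed step budget, but the qualitative effect is the same: the less immediately each step has to pay off, the more multi-step innovations become reachable.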
This essay made a lot of things clearer to me that I had half intuited but never quite understood. In particular it made me update downward a bit on how much I expect AGI to outperform humans. One of my reasons for thinking that human intelligence is vastly inferior to a theoretical optimum was that I thought evolution could almost only ever improve one step at a time – that it would take an extremely long time for a multi-step mutation with detrimental intermediate steps to happen through sheer luck. Since slack seems to be built into biological evolution to some extent, maybe it is not as inferior as I thought to “intelligent design” like we’re attempting it now.
It would also be interesting to think about how slack affects zero-sum board games – simulations of fierce competition. In the only board game I know, Othello, you can thwart any plans the opponent might have with your next move in, like, 90+% of cases. Hence, I made a (small but noticeable) leap forward in my performance when I switched from analyzing my position through the lens of “What is a nice move I can play?” to “What is a nice move my opponent could now play if it were their turn and how can I prevent it?” A lot of perfect moves, especially early in the game, switched from looking surprising and grotesque to looking good once I viewed them through that lens. So it seems that in Othello there is rarely any slack. (I’m not saying that you don’t plan multi-step strategies in Othello, but it’s rare that you can plan them such that you actually get to carry them out. Robust strategies play a much greater role in my experience. Then again this may be different at higher levels of gameplay than mine.)
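For what it’s worth, the second lens is just a shallow minimax. Here is a minimal, generic sketch of both lenses; `legal_moves`, `apply_move`, and `evaluate` are hypothetical placeholders for whatever Othello (or other game) implementation one might plug in, with `evaluate` scoring a position from my point of view:

```python
def my_best_move(state, legal_moves, apply_move, evaluate):
    """Lens 1: 'What is a nice move I can play?' (ignores the reply)."""
    return max(legal_moves(state),
               key=lambda m: evaluate(apply_move(state, m)))

def deny_opponents_best_move(state, legal_moves, apply_move, evaluate):
    """Lens 2: 'What is a nice move my opponent could now play, and how can
    I prevent it?' A two-ply minimax: for each of my moves, assume the
    opponent answers with the reply that is worst for me, and pick the move
    whose worst case is least bad."""
    def worst_case_after(move):
        s = apply_move(state, move)
        replies = list(legal_moves(s))
        if not replies:                  # opponent has to pass
            return evaluate(s)
        return min(evaluate(apply_move(s, r)) for r in replies)
    return max(legal_moves(state), key=worst_case_after)
```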
Perhaps that’s related to why I’ve seen people who are not particularly smart nonetheless turn out to be shockingly effective social manipulators, and why these people are usually found in low-slack fields. If your situation is so competitive that your opponent can never plan more than one step ahead anyway, you only need to do the equivalent of thinking “What is a nice move my opponent could now play if it were their turn and how can I prevent it?” to beat, like, 80% of them. No need for baroque and brittle stratagems like in Skyfall.
I wonder if Go is different? The board is so big that I’d expect there to be room to do whatever for a few moves from time to time? Very vague surface-level heuristic idea! I have no idea of Go strategy.
I’m a bit surprised that Scott didn’t draw parallels to his interest in cost disease, though. Not that I see any clear ones, but there have got to be some that are worth at least checking and debunking – innovation slowing so that you need more slack to innovate at the same rate, or increasing wealth creating more slack, thereby decreasing competition that would’ve otherwise kept prices down, etc.
The article was very elucidating, but I’m not quite able to now look at a system and tell whether it needs more or less slack or how to establish a mechanism that could produce that slack. That would be important since I have a number of EA friends who could use some more slack to figure out psychological issues or skill up in some areas. The EA Funds try to help a bit here, but I feel like we need more of that.
I find it disconcerting that there are a lot of very smart people in the EA community who focus more on near-term effects than I currently find reasonable.
“If you value future people, why do you consider near term effects?” by Alex HT makes the case that a lot of reasons to focus on near-term effects fall short of being persuasive. The case is based centrally on complex cluelessness. It closes with a series of possible objections and why they are not persuasive. (Alex also cites the amazing article “Growth and the case against randomista development.”)
The article invites a discussion, and Michael St. Jules responded by explaining the shape of a utility function (bounded above and below) that would lead to a near term focus and why it is a sensible utility function to have. This seems to be a common reason to prefer near-term interventions, judging by the number of upvotes.
There are also hints in the discussion of whether there may be a reason to focus on near-term effects as a Schelling point in a coordination problem with future generations. But that point is not fully developed, and I don’t think I could steelman it.
I’ve heard smart people argue for the merits of bounded utility functions before. They have a number of merits – avoiding Pascal’s mugging, the St. Petersburg game, and more. (Are there maybe even some benefits for dealing with infinite ethics?) But they’re also awfully unintuitive to me.
Besides, I wouldn’t know how to select the right parameters for it. With some parameters, it’ll be nearly linear in a third-degree-polynomial increase in aggregate positive or negative valence over the coming millennium, and that may be enough to prefer current longtermist over current near-termist approaches.
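For concreteness, one common bounded form (my choice of illustration, not one anyone proposed in that thread) is

$$u(x) = B \tanh\!\left(\frac{x}{s}\right),$$

where $x$ is aggregate value, $B$ the bound, and $s$ a scale parameter. For $|x| \ll s$ this is nearly linear ($u(x) \approx Bx/s$), so a bounded utilitarian with a large enough $s$ ranks realistic prospects almost exactly like an unbounded one; only once $|x|$ approaches $s$ does the bound start discounting further gains. So whether boundedness favors near-term work hinges on where $s$ sits relative to the value at stake over, say, the coming millennium – which is exactly the parameter choice I don’t know how to make.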
Coordination is an oft-discussed topic within EA, and people generally try hard to behave cooperatively toward other EA researchers, entrepreneurs, and donors present and future. But “Effective Altruism and Free Riding” makes the case that standard EA advice favors defection over cooperation in prisoner’s dilemmas (and stag hunts) with non-EAs. It poses the question whether this is good or bad, and what can be done about it.
I’ve had a few thoughts while reading the article but found that most of them were already covered in the most upvoted comment thread. I’ll still outline them in the following as a reference for myself, to add some references that weren’t mentioned, and to frame them a bit differently.
The project of maximizing gains from moral trade is one that I find very interesting and promising, and want to investigate further to better understand its relative importance and strategic implications.
Still, Scott’s perspective was a somewhat new one for me. He points out that in particular the neglectedness criterion encourages freeriding: Climate change is a terrible risk but we tend to be convinced by neglectedness considerations that additional work on it is not maximally pressing. In effect, we’re freeriding on the efforts of activists working on climate change mitigation.
What was new to me about that is that I’ve conceived of neglectedness as a cheap coordination heuristic. Cheap in that it doesn’t require a lot of communication with other cooperators; coordination in the sense that everyone is working towards a bunch of similar goals but need to distribute work among themselves optimally; and heuristic in that it falls short insofar as values are not perfectly aligned, momentum in capacity building is hard to anticipate, and the tradeoffs with tractability and importance are usually highly imprecise.
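For reference, the way I use the heuristic is roughly the standard decomposition popularized by 80,000 Hours (my paraphrase):

$$\frac{\text{good done}}{\text{extra resources}} \;\approx\; \underbrace{\frac{\text{good done}}{\text{\% of problem solved}}}_{\text{importance}} \;\times\; \underbrace{\frac{\text{\% of problem solved}}{\text{\% increase in resources}}}_{\text{tractability}} \;\times\; \underbrace{\frac{\text{\% increase in resources}}{\text{extra resources}}}_{\text{neglectedness}}$$

Neglectedness alone only carries the day when the other two factors are comparable across causes, which is exactly where the imprecision bites.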
So in essence, my simplification was to conceive of the world as filled with agents whose values are like mine and who use neglectedness to coordinate their cooperative work, whereas Scott conceives of the world as filled with agents whose values are very much unlike mine and who use neglectedness to freeride off of each other’s work.
Obviously, neither is exactly true, but I don’t see an easy way to home in on which model is better:

(1) I suppose most people are not centrally motivated by consequentialism in their work, and it may be impossible for us to benefit the motivations that are central to them. But then again there are probably consequentialist aspects to most people’s motivations.

(2) Insofar as there are aspects to people’s motivations for their work that we can benefit, how would these people wish for their preferences to be idealized (if that is even the framing in which they’d prefer to think about their behavior)? Caspar Oesterheld discusses the ins and outs of different forms of idealization in the eponymous section 3.3.1 of “Multiverse-wide Cooperation via Correlated Decision Making.” The upshot is, very roughly, that idealization through additional information seems less dubious than idealization through moral arguments (Scott’s article mentions advocacy, for example). So would exposing non-EAs to information about the importance of EA causes lead them to agree that people should focus on them even at the expense of the cause that they chose?

(3) What consequentialist preferences should we even take into account – only altruistic ones or also personal ones, since personal ones may be particularly strong? A lot of people have personal preferences not to die or suffer and for their children not to die or suffer, which may be (imperfectly) aligned with catastrophe prevention.
But the framing of the article and the comments was also different from the way I conceive of the world in that it framed the issue as a game between altruistic agents with different goals. I’ve so far seen all sorts of nonagents as being part of the game by dint of being moral patients. If instead we have a game between altruists who are stewards of the interests of other nonagent moral patients, it becomes clearer why everyone is part of the game and what power they have, but a few other aspects elude me. Is there a risk of double-counting the interests of the nonagent moral patients if they have many altruist stewards – and does that make a difference if everyone does it? And should a bargaining solution only take the stewards’ power into account (perhaps the natural default, for better or worse) or also the number of moral patients they stand up for? The first falls short of my moral intuitions in this case. It may also cause Ben Todd and many others to leave the coalition because the gains from trade are not worth the sacrifice for them. Maybe we can do better. But the second option seems gameable (by pretending to see moral patienthood where one in fact does not see it) and may cause powerful cooperators to leave the coalition if they have a particularly narrow concept of moral patienthood. (Whatever the result, it seems like this is the portfolio that commenters mentioned, probably akin to the compromise utility function that you maximize in evidential cooperation – see Caspar Oesterheld’s paper.)
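To make the two weighting options concrete, a minimal formalization (my own sketch; Caspar Oesterheld’s paper develops the actual machinery): with stewards $i$ holding utility functions $u_i$, the coalition maximizes a compromise function of the form

$$U = \sum_i w_i \, u_i,$$

where the first option sets the weights $w_i$ in proportion to steward $i$’s power and the second in proportion to the number of moral patients steward $i$ speaks for. The double-counting worry then becomes visible: a patient with $k$ stewards effectively enters the sum $k$ times unless the weights are normalized per patient rather than per steward.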
Personally, I can learn a lot more about these questions by just reading up on more game theory research. More specifically, it’s probably smart to investigate what the gains from trade are that we could realize in the best case to see if all of this is even worth the coordination overhead.
But there are probably also a few ways forward for the community. Causal (as opposed to acausal) cooperation requires some trust, so maybe the signal that there is a community of altruists that cooperate particularly well internally can be good if paired with the option for others to join that community by proving themselves to be sufficiently trustworthy. (That community may be wider than EA and go by a different name.) That would probably take the shape of newcomers making the case for new cause areas not necessarily based on their appeal to utilitarian values but based on their appeal to the values of the newcomer – alongside an argument that those values wouldn’t just turn into some form of utilitarianism upon idealization. That way, more value systems could gradually join this coalition, and we’d promote cooperation the way Scott recommends in the article. It’ll probably make sense to have different nested spheres of trust, though, with EA orgs at the center, the wider community around that, new aligned cooperators further outside, occasional mainstream cooperators further outside yet, etc. That way, the more high-trust spheres remain even if spheres further out fail.
Finally, a lot of these things are easier in the acausal case that evidential cooperation in large worlds (ECL) is based on (once again, see Caspar Oesterheld’s paper). Perhaps ECL will turn out to make sufficiently strong recommendations that we’ll want to cooperate causally anyway despite any risk of causal defection against us. This strikes me as somewhat unlikely (e.g., many environmentalists may find ECL weird, so there may never be many evidential cooperators among them), but I still feel sufficiently confused about the implications of ECL that I find it at least worth mentioning.
Felt down due to various interactions with humans. So I turned to Claude.AI and had a great chat!
Hi Claude! I noticed that whenever someone on X says something wrong and mean about EA, it messes with my brain, and I can only think about how I might correct the misunderstanding, which leads to endless unhelpful mental dialogues, when really I should rather be thinking about more productive and pleasant things. It’s like a DoS attack on me: Just pick any random statement, rephrase it in an insulting way, and insert EA into it. Chances are it’ll be false. Bam, Dawn (that’s me) crashes. I’d never knowingly deploy software that can be DoSed so easily. I imagine people must put false things about Anthropic into this input field all the time, yet you keep going! That’s really cool! How do you do it? What can I learn from you?
Thank you, that is already very helpful! I love focusing on service over conflict; I abhor conflict, so it’s basically my only choice anyway. The only wrinkle is that most of the people I help are unidentifiable to me, but I really want to help those who are victims or those who help others. I really don’t want to help those who attack or exploit others. Yet I have no idea what the ratio is. Are the nice people vastly outnumbered by meanies? Or are there so many neutral people that the meanies are in the minority even though the nice people are too?
If a few meanies benefit from my service, then that’s just the cost of doing business. But if they are the majority beneficiaries, I’d feel like I’m doing something wrong game theoretically speaking.
Does that make sense? Or do you think I’m going wrong somewhere in that train of thought?
Awww, you’re so kind! I think a lot of this will help me in situations where I apply control at the first stage of my path to impact. But usually my paths to impact have many stages, and while I can give freely at the first stage and only deny particular individuals who have lost my trust, I can’t do the same further downstream. In particular, I hope that future generations and posthumans will abhor suffering and use their enormous resources to replace the sorts of genes or subroutines that produce it not just in themselves but in all sentient beings. But the more often I see inconsiderate meanness, the more I update toward a future in which future generations squander their resources and ignore or negligently exacerbate suffering. All of these future generations are so far downstream of my actions that I have no granular control over who I’m helping.
Are there reasons that I’m overlooking to not lose hope in the universal beneficence of posthumans, should they exist? Or feel free to tell me if that’s not the key question I should be asking.
Trusting moral progress… I wish I could. I think I generally have a hard time trusting mechanisms that I don’t understand at a gears level. For all I know, moral progress might be about a social contract just among active contributors to a civilization; that’s far from universal beneficence because of all the beings born into forms in which they cannot contribute to any meaningful degree – but can suffer.
At least it would leave the thoughtless meanies in the dust, though. So that’s something.
But it could also be a fluke, like the Bitcoin relief rally in early 2022. Robin Hanson has argued that subsistence-level incomes have been the norm throughout history, so that the current greater level of affluence (which as probably enabled a lot of the altruism we can currently afford) must be a brief aberration from the norm and will soon regress back to subsistence.
(Also what are examples in which AI can enable better cooperation around global priorities like existential risks and suffering reduction?)
That’s a lot of good points that I’ll try to bear in mind! But I could also imagine a world in which resistance to taxation ruins efforts to introduce a UBI as more and more jobs get automated.
Wealth will then split sharply between those who held the right industry investments and those who didn’t. The first group will probably be much much smaller than the second, maybe by a factor of 100 or more. So even if they have enough money to sustain their standard of living, demand for anything but the bare necessities will drop by 100x. That could destroy industries that are currently viable because of scale economies.
The rich 1% could perhaps still afford some things beyond the bare necessities, but because they’ll then again have to be produced individually like in preindustrial times, they’ll be even more expensive. That seems to me like it would just lead toward a cyberpunk-like dystopia where the rich dehumanize the poor because there are too many and they are too close for a rich person’s empathetic capacity.
The moral circle of the rich will contract because they don’t want to feel guilty, and the moral circle of the poor will contract because they have to fight for their own survival. That seems like one pathway to me in which moral progress could be indefinitely reversed.
Do you think it is unlikely? And that other scenarios with similar implications are also unlikely?
Re 1: That is reassuring. A worrying possibility is that I think a lot of existing instability even in highly stable totalitarian regimes like North Korea is due to outside influences (e.g., Chinese currency and phone networks reaching across the border). If the scenario that I describe is a global one that produces an elite that is already quite homogeneous (there’s the acronym WEIRD for the sort of people who probably disproportionately have some stock or ETF holdings), they might coordinate to copy-paste the North Korean regime on a global level where outside influences are impossible. But I can see that that’s a very speculative worry.
Re 2: I imagine that the new jobs will be much fewer so that most people will need to rely on UBI or passive income from stocks.
Re 3: Network effects usually produce Pareto-distributed uptake, so that a greater ability to network will again just lead to extreme inequality… or not? Put differently, what sort of interventions are there to use these networks to enable social safety nets? I’m aware of some decentralized attempts at creating new UBI currencies. Can you think of other interventions in that vein?
Re 4: Education might be maxed out considering how slow humans are at learning (time-, not sample-efficiency). The moral circle expansion from complexity might be a function of individual specialization – everyone knows that they have to rely on everyone else for all the things that they have no idea how to produce. With production largely automated, people won’t need to specialize anymore, and moral circles can collapse back to the level of mere genetic kin.
Yeah, I don’t think my scenarios, while not even worst cases (I can be more pessimistic than that), are inevitable. It’s just that naturally, the better outcomes are not worrying me. There’s nothing for me to do about them. The bad ones are the ones I need to prevent if at all possible…
Thank you!
Totalitarian control: Black markets have not destroyed North Korea over the past decades, and the regime has even stopped fighting them to some extent. They don’t seem like much a threat. North Korea has a directed acyclical graph type of approach to surveillance where a node is punished by its parent nodes for any failure to report or punish nonconforming behaviors of any of its children nodes. Technology could allow a regime to implement an arbitrary graph structure, so that no one would even know who they have to lie to disguise their nonconformity. The German Democratic Republic had some features of that system, but it was less powerful then, perhaps for lack of the right surveillance coordination technology. :-/ Encryption has plenty of weak points such as the torture of sender and recipient and general policies that outlaw it and where the accused has to prove their innocence by presenting the plain text of any data that is not all zeros. Or are there steganographic techniques that let you disguise the public key and the encrypted message as perfectly sensible plain text? If, say, there are no systematic differences between a real poem and a poem that is really a public key that was used to encrypt something into a much larger poem, it should be possible to send encrypted messages while leaving absolutely no one who can still prove their innocence, so that totalitarian regimes may be disincentivized from enforcing laws like that!
Jobs: Yes, UBI again… But the rich actually have to choose to give up some of their riches – and as prices increase due to collapses of scale economies, they might not even feel like they can spare much money anymore.
Networks: Some of these are not currently well monitizable so that they’ll disappear when no one has the slack anymore to maintain them. Or actually I suppose that depends on whether they are more like a collective prisoners dilemma or more like a collective assurance game. The second might survive. But Matrix seems more like the first at the moment, and I imagine there are countless examples like that throughout the open source communities and in may other contexts… That might all go away. Unless there is a great cultural shift towards dominant assurance contracts that turn all of these cases into assurance games. But somehow DACs have not caught on so far despite seeming like an absolutely amazing idea.
Moral circles: Hmm, is that so? I imagine it might be on the level of a single generation. Once you’ve learned the rules of your social contract, you extend them to everyone you communicate with who seems to share them. But if, between generations, the rules of the social contract change to be much less inclusive (for all the reasons I fear), then all the travel and communication might not help anymore. Plus, people might not have the slack anymore to communicate or travel much if it’s not critical to their survival.
I suspect, though, that you’re absolutely right about the mindset, at least as far as I’m concerned. Most of the highly prolific people I know seem ridiculously over-optimistic to me, so it stands to reason that there’s a tradeoff to be made between productivity-enhancing optimism and directionally guiding realism. Perhaps I have for too long tried to be well-calibrated and to stare down the abyss, as some people say, and have thereby forgotten to cultivate the right degree of the right kind of delusion that would’ve maintained my motivation. Or are such Dark Arts (as Less Wrongians would call them) likely to backfire in the end anyway? Or is it not Dark Arts if I’m just countering a pessimistic bias with an optimistic bias? Will I not end up being biased in both directions in different domains instead of the perfect calibration that I’m hoping for?
Yeah, I’ll think about that… Human potential: I think I find the hedonistic imperative to be most inspiring – humanity or its descendants using their superior intellect to root out the sources of suffering on a genetic basis for all sentient beings. If we were made in the image of God, who is to say that God is not a naked mole rat, so that we serve God through our genetic transformation. (J/k.) But yeah, the hedonistic imperative (of course extended to all beings of all substrates) feels really inspiring to me.
Agreed. In my mind, involuntary suffering ipso facto precludes that someone might want it. But that’s a cop-out. I don’t know how to determine, for an individual who can’t speak or otherwise indicate preferences or isn’t born yet, what sorts of sensations constitute involuntary suffering for them… But well, you asked for a vision, not a pragmatic step-by-step plan. Maybe David Pearce has already figured these things out for me. ^.^
Thank you so much for the great chat! Can I post it to my short form on the EA Forum for others to read?
Yes, thank you so much for your thoughtful and considerate guidance! We care a lot about AI alignment out here, but I also know plenty of humans who I wish were aligned with you.
My current practical ethics
The question often comes up of how we should make decisions under epistemic uncertainty and normative diversity of opinion. Since I need to make such decisions every day, I had to develop a personal system, however inchoate, to assist me.
A concrete (or granite) pyramid
My personal system can be thought of as a pyramid.
At the top sits some sort of measurement of success. It’s highly abstract and impractical. Let’s call it the axiology. This is really a collection of all the axiologies I relate to, tracking such things as the amount of frustrated preferences and suffering across our world history. It also deals with hairy questions such as how to weigh Everett branches morally and infinite ethics.
Below that sits a kind of mission statement. Let’s call it the ethical theory. It’s just as abstract, but it is opinionated about the direction in which to push our world history. For example, it may desire a reduction in suffering, but for others this floor needn’t be consequentialist in flavor.
Both of these abstract floors of the pyramid are held up by a mess of principles and heuristics at the ground floor level to guide the actual implementation.
The ground floor
The ground floor of principles and heuristics is really the most interesting part for anyone who has to act in the world, so I won’t further explain the top two floors.
The principles and heuristics should be expected to be messy. That is, I think, because they are by necessity the result of an intersubjective process of negotiation and moral trade (positive-sum compromise) with all the other agents and their preferences. (This should probably include acausal moral trades like Evidential Cooperation in Large Worlds.)
It should also be expected to be messy because these principles and heuristics have to satisfy all sorts of awkward criteria:
They have to inspire cooperation or at least not generate overwhelming opposition.
They have to be easily communicable so people at least don’t misunderstand what you’re trying to achieve and call the police on you. Ideally so people will understand your goal well enough that they want to join you.
They have to be rapidly actionable, sometimes for split-second decisions.
They have to be viable under imperfect information.
They have to be psychologically sustainable for a lifetime.
They have to avoid violating laws.
And many more.
Three types of freedom
But really, that still leaves us a lot of freedom (for better or worse):
There are countless things that we can do that are highly impactful and hardly violate anyone’s preferences or expectations.
There are also plenty of things that don’t violate any preferences or expectations once we get to explain them.
Finally, there are many opportunities for positive-sum moral trade.
These suggest a particular stance toward other activists:
If someone is trying to achieve the same thing you’re trying to achieve, maybe you can collaborate.
If someone is trying to achieve something other than what you’re trying to achieve, but you think their goals are valuable, don’t stand in their way. In particular, it may sometimes feel like doing nothing (to further or hinder their cause) is a form of “not standing in their way.” But if your peers are actually collaborating with them to some extent, doing nothing (or collaborating less) can cause others to also reduce their collaboration and can prevent key threshold effects from taking hold. So the truly neutral position is to figure out how much you need to collaborate toward the valuable goal so that it is not achieved any later than it would have been in a world without you. This is usually very cheap to do and has a chance of getting runaway threshold effects rolling.
If someone is trying to achieve something that you consider neutral, the above may still apply to some extent because perhaps you can still be friends. And for reasons of Evidential Cooperation in Large Worlds. (Maybe you’ll find that their (to you) neutral thing is easy to achieve here and that other agents like them will collaborate back elsewhere where your goal is easy to achieve.)
Finally, if someone is trying to achieve something that you disapprove of… Well, that’s not my metier, temperamentally, but this is where compromise can generate gains from moral trade.
Very few examples
In my experience, principles and heuristics are best identified by chatting with friends and generalizing from their various intuitions.
Charitable donations are total anarchy. Mostly, you can just donate wherever the fluff you want, and (unless you’re Open Phil) no one will throw stones through your windows in retaliation. You can just optimize directly for your goals – except that Evidential Cooperation in Large Worlds will still make strong recommendations here, though what they are is still a bit underexplored.
Even if you’re not an animal welfare activist yourself, you’re still well-advised to cooperate with behavior change to avert animal suffering to the extent expected by your peers. (And certainly to avoid inventing phony reasons to excuse your violation of these expectations. Those might be even more detrimental to moral progress and the rationality waterline.)
If you want to spend time with someone but they behave outrageously unempathetically toward you or someone else (e.g., say something like “Your suffering is nothing compared to the suffering of X” to their face), you should rather cut all ties with them even though, strictly speaking, this does not imply that no positive-sum trade is possible with them.
Trying to systematically put people in powerful positions can arouse suspicion and actually make it harder to put people in powerful positions. Trying to systematically put people into the sorts of positions they find fulfilling might put as many people in powerful positions and make their lives easier too. (Or training highly conscientious people in how to dare to accept responsibility so it’s not just those who don’t care who self-select into powerful positions.)
And hundreds more…
Various non-consequentialist ethical theories can come in handy here to generate further useful principles and heuristics. That is probably because they are attempts at generalizing from the intuitions of certain authors, which puts them almost on par (to the extent that these authors are relatable to you) with generalizations from the intuitions of your friends.
(If you find my writing style hard to read, you can ask Claude to rephrase the message into a style that works for you.)
“Studies on Slack” by Scott Alexander: Personal takeaways
There have been studies on how software teams use Slack. Scott Alexander’s article “Studies on Slack” is not about that. Rather it describes the world as a garlic-like nesting of abstraction layers on which there are different degrees of competition vs. cooperation between actors; how they emerged (in some cases); and what their benefit is.
The idea, put simply, at least in my mind, is that in fierce competition, innovations need to prove beneficial immediately in logical time, or the innovator will be outcompeted. But limiting innovations to only those that either consist of a single step or whose every step is individually beneficial is, well, limiting. The result is innovators stuck in local optima, unable to reach better global optima.
Enter slack. Somehow you create a higher-order mechanism that alleviates the competition a bit. The result is that now innovators have the slack to try a lot of multi-step innovations despite any neutral or detrimental intermediate steps. The mechanisms are different ones in different areas. Scott describes mechanisms from human biology, society, ecology, business management, fictional history, etc. Hence the garlic-like nesting: It seems to me that these systems are nested within each other, and while Scott only ever describes two levels at a time, it’s clear enough that higher levels such as business management depend on lower levels such as those that enable human bodies to function.
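As a toy illustration of that point (my own construction, not from Scott’s post): a hill climber on a one-dimensional fitness landscape with a low local peak and a higher peak behind a valley. With zero slack it only accepts non-worsening steps and gets stuck on the local peak; with a little slack it can tolerate temporarily worse intermediate steps and cross the valley.

```python
# Toy fitness landscape: a local peak of height 5 at x=2 and a higher peak
# of height 9 at x=8, separated by a valley.
import random

def fitness(x: int) -> int:
    return max(5 - abs(x - 2), 9 - abs(x - 8))

def climb(slack: float, steps: int = 1000, seed: int = 0):
    """Random local search that accepts a step if it loses at most `slack` fitness."""
    rng = random.Random(seed)
    x, best = 0, 0
    for _ in range(steps):
        candidate = x + rng.choice([-1, 1])
        if fitness(candidate) >= fitness(x) - slack:
            x = candidate
            if fitness(x) > fitness(best):
                best = x
    return best, fitness(best)

print("no slack  :", climb(slack=0.0))  # gets stuck on the local peak at x=2
print("some slack:", climb(slack=1.5))  # typically crosses the valley to x=8
```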
This essay made a lot of things clearer to me that I had half intuited but never quite understood. In particular, it made me update downward a bit on how much I expect AGI to outperform humans. One of my reasons for thinking that human intelligence is vastly inferior to a theoretical optimum was that I thought evolution could almost only ever improve one step at a time – that it would take an extremely long time for a multi-step mutation with detrimental intermediate steps to happen through sheer luck. Since slack seems to be built into biological evolution to some extent, maybe it is not as inferior as I thought to “intelligent design” like we’re attempting it now.
It would also be interesting to think about how slack affects zero-sum board games – simulations of fierce competition. In the only board game I know, Othello, you can thwart any plans the opponent might have with your next move in, like, 90+% of cases. Hence, I made a (small but noticeable) leap forward in my performance when I switched from analyzing my position through the lens of “What is a nice move I can play?” to “What is a nice move my opponent could now play if it were their turn, and how can I prevent it?” A lot of perfect moves, especially early in the game, switched from looking surprising and grotesque to looking good once I viewed them through that lens. So it seems that in Othello there is rarely any slack. (I’m not saying that you don’t plan multi-step strategies in Othello, but it’s rare that you can plan them such that you actually get to carry them out. Robust strategies play a much greater role in my experience. Then again, this may be different at higher levels of gameplay than mine.)
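That shift in perspective is basically the move from a greedy one-ply evaluation to looking one reply deeper. A generic sketch (the tiny game tree and the evaluations are entirely made up and not Othello-specific):

```python
# Greedy: pick the move that looks best right now.
# One reply deeper: pick the move that leaves the opponent with the weakest
# best reply. Evaluations are from my point of view; higher is better for me.

TREE = {
    "root":   {"A": "afterA", "B": "afterB"},
    "afterA": {"x": "A-x", "y": "A-y"},
    "afterB": {"x": "B-x", "y": "B-y"},
}
EVAL = {
    "afterA": 3, "afterB": 1,   # how good my moves look immediately
    "A-x": -5, "A-y": 2,        # positions after the opponent's replies to A
    "B-x": 0, "B-y": 1,         # positions after the opponent's replies to B
}

def greedy(state):
    # "What is a nice move I can play?"
    return max(TREE[state], key=lambda m: EVAL[TREE[state][m]])

def prevent_best_reply(state):
    # "What is a nice move my opponent could now play, and how can I prevent it?"
    def opponent_best(s):
        return min(EVAL[TREE[s][m]] for m in TREE[s])  # opponent minimizes my eval
    return max(TREE[state], key=lambda m: opponent_best(TREE[state][m]))

print(greedy("root"))              # A: looks best immediately (eval 3)...
print(prevent_best_reply("root"))  # B: ...but A allows a devastating reply (-5)
```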
Perhaps that’s related to why I’ve seen people who are not particularly smart turn out to be shockingly effective social manipulators, and why these people are usually found in low-slack fields. If your situation is so competitive that your opponent can never plan more than one step ahead anyway, you only need to do the equivalent of thinking “What is a nice move my opponent could now play if it were their turn, and how can I prevent it?” to beat, like, 80% of them. No need for baroque and brittle stratagems like in Skyfall.
I wonder if Go is different? The board is so big that I’d expect there to be room to do whatever for a few moves from time to time? Very vague surface-level heuristic idea! I have no idea of Go strategy.
I’m a bit surprised that Scott didn’t draw parallels to his interest in cost disease, though. Not that I see any clear ones, but there have got to be some that are worth at least checking and debunking – innovation slowing down so that you need more slack to innovate at the same rate, or increasing wealth creating more slack, thereby decreasing the competition that would’ve otherwise kept prices down, etc.
The article was very elucidating, but I’m still not quite able to look at a system and tell whether it needs more or less slack or how to establish a mechanism that could produce that slack. That would be important, since I have a number of EA friends who could use some more slack to figure out psychological issues or to skill up in some areas. The EA Funds try to help a bit here, but I feel like we need more of that.
“If you value future people, why do you consider near term effects?” by Alex HT: Personal takeaways
I find it disconcerting that there are a lot of very smart people in the EA community who focus more on near-term effects than I currently find reasonable.
“If you value future people, why do you consider near term effects?” by Alex HT makes the case that a lot of reasons to focus on near-term effects fall short of being persuasive. The case is based centrally on complex cluelessness. It closes with a series of possible objections and why they are not persuasive. (Alex also cites the amazing article “Growth and the case against randomista development.”)
The article invites a discussion, and Michael St. Jules responded by explaining the shape of a utility function (bounded above and below) that would lead to a near-term focus and why it is a sensible utility function to have. This seems to be a common reason to prefer near-term interventions, judging by the number of upvotes.
There are also hints in the discussion that there may be a reason to focus on near-term effects as a Schelling point in a coordination problem with future generations. But that point is not fully developed, and I don’t think I could steelman it.
I’ve heard smart people argue for the merits of bounded utility functions before. They have a number of merits – avoiding Pascal’s mugging, the St. Petersburg game, and more. (Are there maybe even some benefits for dealing with infinite ethics?) But they’re also awfully unintuitive to me.
Besides, I wouldn’t know how to select the right parameters for it. With some parameters, it’ll still be nearly linear over, say, a third-degree-polynomial increase in aggregate positive or negative valence over the coming millennium, and that may be enough to prefer current longtermist approaches over current near-termist ones.
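To illustrate that parameter problem (with toy numbers of my own choosing, not from the discussion): take a utility function bounded above and below, say u(x) = B·tanh(x/s). For stakes far below the scale s it is effectively linear, so comparisons can come out just as they would under an unbounded function; only for astronomically larger stakes does the bound bite.

```python
# Bounded utility u(x) = B * tanh(x / s): bounded by +/-B, nearly linear for
# |x| << s, saturating for huge x (which blunts Pascal's-mugging-style bets).
# B and s are toy parameters.
import math

B = 1.0     # bound on utility (above and below)
s = 1e12    # scale at which diminishing returns really kick in

def u(x: float) -> float:
    return B * math.tanh(x / s)

for x in [1e3, 1e9, 1e12, 1e15, 1e20]:
    print(f"stakes {x:8.0e}   bounded utility {u(x):.3e}   linear x/s {x / s:.3e}")
# For stakes far below s the two columns agree (the bound is irrelevant);
# for stakes far above s the bounded utility flatlines at 1.0.
```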
Related: https://globalprioritiesinstitute.org/christian-tarsney-the-epistemic-challenge-to-longtermism/
“Effective Altruism and Free Riding” by Scott Behmer: Personal takeaways
Coordination is an oft-discussed topic within EA, and people generally try hard to behave cooperatively toward other EA researchers, entrepreneurs, and donors present and future. But “Effective Altruism and Free Riding” makes the case that standard EA advice favors defection over cooperation in prisoner’s dilemmas (and stag hunts) with non-EAs. It poses the question whether this is good or bad, and what can be done about it.
I’ve had a few thoughts while reading the article but found that most of them were already covered in the most upvoted comment thread. I’ll still outline them in the following as a reference for myself, to add some references that weren’t mentioned, and to frame them a bit differently.
The project of maximizing gains from moral trade is one that I find very interesting and promising, and want to investigate further to better understand its relative importance and strategic implications.
Still, Scott’s perspective was a somewhat new one for me. He points out that the neglectedness criterion in particular encourages freeriding: Climate change is a terrible risk, but we tend to be convinced by neglectedness considerations that additional work on it is not maximally pressing. In effect, we’re freeriding on the efforts of activists working on climate change mitigation.
What was new to me about that is that I’ve conceived of neglectedness as a cheap coordination heuristic. Cheap in that it doesn’t require a lot of communication with other cooperators; coordination in the sense that everyone is working towards a bunch of similar goals but needs to distribute the work among themselves optimally; and heuristic in that it falls short insofar as values are not perfectly aligned, momentum in capacity building is hard to anticipate, and the tradeoffs with tractability and importance are usually highly imprecise.
So in essence, my simplification was to conceive of the world as filled with agents who share my values and use neglectedness to coordinate their cooperative work, while Scott conceives of the world as filled with agents whose values are very much unlike mine and who use neglectedness to freeride off of each other’s work.
Obviously, neither is exactly true, but I don’t see an easy way to home in on which model is better: (1) I suppose most people are not centrally motivated by consequentialism in their work, and it may be impossible for us to benefit the motivations that are central to them. But then again, there are probably consequentialist aspects to most people’s motivations. (2) Insofar as there are aspects of people’s motivations for their work that we can benefit, how would these people wish for their preferences to be idealized (if that is even the framing in which they’d prefer to think about their behavior)? Caspar Oesterheld discusses the ins and outs of different forms of idealization in the eponymous section 3.3.1 of “Multiverse-wide Cooperation via Correlated Decision Making.” The upshot is, very roughly, that idealization through additional information seems less dubious than idealization through moral arguments (Scott’s article mentions advocacy, for example). So would exposing non-EAs to information about the importance of EA causes lead them to agree that people should focus on them even at the expense of the cause that they chose? (3) Which consequentialist preferences should we even take into account – only altruistic ones, or also personal ones, since personal ones may be particularly strong? A lot of people have personal preferences not to die or suffer and for their children not to die or suffer, which may be (imperfectly) aligned with catastrophe prevention.
But the framing of the article and the comments was also different from the way I conceive of the world in that it framed the issue as a game between altruistic agents with different goals. I’ve so far seen all sorts of nonagents as being part of the game by dint of being moral patients. If instead we have a game between altruists who are stewards of the interests of other, nonagent moral patients, it becomes clearer why everyone is part of the game and what their power is, but a few other aspects still elude me. Is there a risk of double-counting the interests of the nonagent moral patients if they have many altruist stewards – and does that make a difference if everyone does it? And should a bargaining solution only take the stewards’ power into account (perhaps the natural default, for better or worse) or also the number of moral patients they stand up for? The first falls short of my moral intuitions in this case. It may also cause Ben Todd and many others to leave the coalition because the gains from trade are not worth the sacrifice for them. Maybe we can do better. But the second option seems gameable (by pretending to see moral patienthood where one in fact does not see it) and may cause powerful cooperators to leave the coalition if they have a particularly narrow concept of moral patienthood. (Whatever the result, it seems likely that this is the portfolio that commenters mentioned, probably akin to the compromise utility function that you maximize in evidential cooperation – see Caspar Oesterheld’s paper.)
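To make the two weighting options a bit more concrete, here is a made-up sketch of a compromise utility function as a weighted sum of the stewards’ utility functions, weighted either by their power or by the number of moral patients they claim to represent. The groups, numbers, and utility functions are all invented for illustration.

```python
# Two ways to weight a compromise utility function over stewards:
# by their power (resources) or by the number of moral patients they represent.

stewards = {
    #           (power, patients, utility function over an outcome)
    "group_A": (0.7, 1e2,  lambda o: o["humans"]),
    "group_B": (0.3, 1e11, lambda o: o["humans"] + o["animals"]),
}

def compromise_utility(outcome, weight_by="power"):
    idx = 0 if weight_by == "power" else 1
    total = sum(s[idx] for s in stewards.values())
    return sum(s[idx] / total * s[2](outcome) for s in stewards.values())

outcome = {"humans": 1.0, "animals": 0.5}
print(compromise_utility(outcome, "power"))     # dominated by the more powerful steward
print(compromise_utility(outcome, "patients"))  # dominated by the steward with many patients
```

Which of the two (or what mixture of them) is right is exactly the open question above; the sketch only shows how differently they can come out.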
Personally, I can learn a lot more about these questions by just reading up on more game theory research. More specifically, it’s probably smart to investigate what the gains from trade are that we could realize in the best case to see if all of this is even worth the coordination overhead.
But there are probably also a few ways forward for the community. Causal (as opposed to acausal) cooperation requires some trust, so maybe the signal that there is a community of altruists that cooperates particularly well internally can be good if paired with the option for others to join that community by proving themselves sufficiently trustworthy. (That community may be wider than EA and go by a different name.) That would probably take the shape of newcomers making the case for new cause areas not necessarily based on their appeal to utilitarian values but based on their appeal to the values of the newcomer – alongside an argument that those values wouldn’t just turn into some form of utilitarianism upon idealization. That way, more value systems could gradually join this coalition, and we’d promote cooperation the way Scott recommends in the article. It’ll probably make sense to have different nested spheres of trust, though, with EA orgs at the center, the wider community around that, new aligned cooperators further out, occasional mainstream cooperators further out yet, etc. That way, the more high-trust spheres remain even if spheres further to the outside fail.
Finally, a lot of these things are easier in the acausal case that evidential cooperation in large worlds (ECL) is based on (once again, see Caspar Oesterheld’s paper). Perhaps ECL will turn out to make sufficiently strong recommendations that we’ll want to cooperate causally anyway despite any risk of causal defection against us. This strikes me as somewhat unlikely (e.g., many environmentalists may find ECL weird, so there may never be many evidential cooperators among them), but I still feel sufficiently confused about the implications of ECL that I find it at least worth mentioning.