Detachment vs attachment [AI risk and mental health]
“What? Why would I choose cosmic energy over Katara?”
Avatar: The Last Airbender
“You idiot,” said the monk from atop his mountain, “there’s all this fresh air up here, the view is breathtaking, and you’re down there dredging mud.” The mud-dredger only gritted his teeth. After finishing the foundations, he cut down some lumber and built a cabin. That winter, the mountaintop monk died of starvation and was eaten by bears.
There is a common trope: that to achieve “inner peace”, you must to some degree disentangle your desires from the atoms around you. E.g. caring about wealth or social status means entangling your goals with the material world, which is Bad because it externalizes your locus of control.
I call this “being the mountaintop monk”. A mountaintop monk disentangles their desires from atomic reality to such an extreme degree that they end up retiring to a monastery in Tibet and spending the rest of their life meditating in the lotus position and being mysterious at people.
A lot of credit is given to the Way of the mountaintop monk. Cached thoughts like “one must make peace with death” or “but is human extinction really a bad thing?” are in accordance with his general philosophy. I can imagine most of my friends listening to this description and nodding along, thinking “this is wise”.
There are, however, significant disadvantages to being a mountaintop monk. For instance, if you believe in AI x-risk, you understand that there is nowhere to hide from an intelligent threat. You cannot hide at the top of a mountain and meditate your way into Nirvana to escape death. If your desires are in any way still entangled with reality (you might have loved ones to protect), then retiring to a mountain is not a good plan.
The best example I can think of in fiction is a scene in Avatar: The Last Airbender in which Aang, in order to reach the Avatar State on demand,[1] must relinquish his “attachment to this world”, namely the person he’s in love with. “What?” Aang exclaims. “Why would I choose cosmic energy over Katara?”[2]
I know a guy who believes that humanity and all his loved ones don’t have long to live. But he holds this belief while not taking the offensive steps necessary to prevent them from dying. I suspect part of the explanation is that taking a lot of 5-MeO-DMT makes you lose your intuition for consequentialism. From my perspective, he has drugged himself into a permanent disentanglement between his values and the atoms outside of his head.
I am not like this. I am a mud-dredger (that’s how it feels sometimes), not a mountaintop monk. I’m entangled enough with reality that I think you’d have to be pretty loony not to do everything in your power to increase your loved ones’ log odds of survival.
This leads me to actually do things that increase my loved ones’ log odds, like, hopefully, publishing this post.
It’s not always fun.
Entangling your desires with atoms means making yourself more vulnerable to that which you cannot control.
If your feedback loops are too loose, it can be emotionally draining to pursue a goal that exists only in the abstract.
Being a mud-dredger will expose you to impossible problems.
When “well-being” is no longer the end—when the end is elsewhere, stored on atoms—that is when you become most vulnerable to anxiety, and frustration, and desperation.
Using words like “I should” or “here’s the plan” implies entangling your brain with outside atoms instead of letting it do its thing. That restricts your thoughtspace because you are aiming it, which can impede creativity.[3]
Mud-dredging does improve your rationality, however. That’s why betting works. When the outcome of your thoughts is entangled with your bank account—or your daughter’s life—you had better be maximizing stamps, not stampyness. Because you can no longer afford to treat what happens inside your 1400 cubic centimeters of cranial compute as your end, you must make that compute take the shape of reality. You have to go on the offensive, and that means slamming into reality (which is as pleasant as it sounds).
It is a spectrum. Sometimes you’ll feel a lot more like a mountaintop monk than a mud-dredger, and other times the opposite. It is not a good idea to be entirely a mud-dredger. At least, it’s certainly not the route to good mental health. Meditation, sabbath, and gratitude are all mountaintop virtues worth pursuing. But mountaintopping in general will pull you away from anything you care about that runs on atoms. That may not be something you can afford.
- ^
A goal originally spurred by a consequentialist desire to save the world, mind you!
- ^
Somehow, the full scene is available on YouTube.
- ^
Sometimes, you might not want to begin your thinking by entangling thoughts with atoms. Letting your mind wander has become clichéd enough advice that you no longer think to apply it.
- ^
So: slam into reality more often.
Two things:
1. I think of “invested but not attached [to the outcome]” as a Pareto-optimal strategy that is neither attached nor detached.
2. I disagree with the second-to-last paragraph, “Mud-dredging does improve your rationality, however. That’s why betting works.” I think that if you’re escaping into the mountains, then it’s true that coming down from the mountain will give you actual data and some degree of accountability. But it’s not obvious to me that 1) mud-dredging increases rationality, or 2) that the kind of rationality mud-dredging maybe increases is actually more beneficial than harmful in the long run in terms of performance. Furthermore, among all the frameworks out there for mental health or productivity, creativity is almost universally valued as a thing to foster more than rationality in terms of performance/success, so I’m curious where you’re coming from.
Hello! Thanks for commenting!
How does that work? In your specific case, what are you invested in while being detached from the outcome? I can imagine enjoying life working like this: e.g. I don’t care what I’m learning about if I’m reading a book for pleasure. Parts of me also enjoy the work I tell myself helps with AI safety. But there are certainly parts of it that I dislike yet do anyway, because I attach a lot of importance to the outcome.
Those are interesting points!
1) Mud-dredging makes rationality a necessity. If you’ve taken DMT and had a cosmic revelation in which you discovered that everything is connected and death is an illusion, then you don’t need to actively not die. I know people to whom life or death is all the same: my point is that if you care about the life/death outcome, you must be on the offensive, somewhat. If you sit in the same place for long enough, you die. There are posts about “rationality = winning”, and I’m not going to get into semantics, but what I meant here by rationality was “that which gets you what you want”. You can’t afford to e.g. ignore truth when something you value is at risk. Part of it was referencing this post, which made clear for me that entangling my rationality with reality more thoroughly would force me to improve it.
2) I’m not sure what you mean. We may be talking about two different things: what I meant by “rationality” was specifically what gets you good performance. I didn’t mean some daily applied system with pros and cons for mental health or performance. I’m thinking about something broader than that.
As for that last point, I seem to have regrettably framed creativity and rationality as mutually incompatible. I wrote in the drawbacks of mud-dredging that aiming at something can impede creativity, which I think is true. The solution for me is splitting my time between “should”-injunction time and free time for fooling around. Not a novel solution or anything. Again, it’s a spectrum, so I’m not advocating for full-on mud-dredging: that would be bad for performance (and mental health) in the long run. This post is the best I’ve read that explores this failure mode. I certainly don’t want to appear like I’m disparaging creativity.
(However, I do think that rationality is more important than creativity. I care more about making sure my family members don’t die than about having fun, and so when I reflect on it all, I decide that I’ll be treating creativity as a means, not an end, for the time being. It’s easy to say I’ll be using creativity as a means, but in practice, I love doing creative things and so it becomes an end.)
An example of invested but not attached: I’m investing time/money/energy into taking classes on subject X. I chose subject X because it could help me generate more of value Y, which I care about. But I’m not attached to getting good at X; I’m invested in the process of learning it.
I feel more confused after reading your other points. What is your definition of rationality? Is this definition also what EA/LW people usually mean? (If so, who introduced this definition?)
When you say rationality is “what gets you good performance”, that seems like it could lead to arbitrary circular reasoning about what is and isn’t rational. If I exaggerate this concern and define rationality as “what gets you the best life possible”, that’s not a helpful definition, because it leads to the unfalsifiable claim that rationality is optimal while providing no practical insight.
Okay, forget what I said; I sure can tie myself up in knots. Here’s another attempt:
If a person is faced with the decision to either save 100 out of 300 people for sure, or take a 60% chance of saving everyone, they are likely (in my experience asking friends) to answer something like “I don’t gamble with human lives” or “I don’t see the point of thought experiments like this”. Eliezer Yudkowsky claims in his “Something to Protect” post that if those same people were faced with this problem and a loved one was among the 300, they would have more incentive to ‘shut up and multiply’. People are more likely to choose the option with higher expected value if they are more entangled with the end result (and less likely to e.g. signal indignation at having to gamble with lives).
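To make the ‘shut up and multiply’ arithmetic concrete, here is a minimal sketch using the numbers from the thought experiment above (the variable names are just for illustration):

```python
# Expected lives saved under each option in the thought experiment above.
sure_saves = 100            # the "safe" option: 100 of the 300 survive for certain
total_at_risk = 300         # everyone at risk
p_gamble_succeeds = 0.60    # the gamble saves everyone with 60% probability

ev_safe = sure_saves                            # 100 expected lives
ev_gamble = p_gamble_succeeds * total_at_risk   # 0.60 * 300 = 180 expected lives

print(f"Safe option:   {ev_safe} lives in expectation")
print(f"Gamble option: {ev_gamble:.0f} lives in expectation")
```

The gamble comes out ahead in expectation, which is exactly the conclusion people tend to resist until something they personally value is on the line.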
I see this in practice, and I’m sure you can relate: I’ve often been told by family members that putting numbers on altruism takes the whole spirit out of it, or that “malaria isn’t the only important thing, coral is important too!”, or that “money is complicated and you can’t equate wasted money with wasted opportunities for altruism”.
These ideas look perfectly reasonable to them, but I don’t think they would hold up for a second if their child had cancer: “putting numbers on cancer treatment for your child takes the whole spirit out of saving them (like you could put a number on love)”, or “your child surviving isn’t the only important thing, coral is important too”, or “money is complicated, and you can’t equate wasting money with spending less on your child’s treatment”.
Those examples might be a bit personal. My point is that entangling the outcome with something you care about makes you more likely to try to make the right choice. Perhaps I shouldn’t have used the word “rationality” at all. “Rationality” might be a valuable component of making the right choice, but for my purposes I only care about making the right choice, no matter how you get there.
The practical insight is that you should start by thinking about what you actually care about, and then backchain from there. If I start off deciding that I want to maximize my family’s odds of survival, I think I am more likely to take AI risk seriously (in no small part, I think, because signalling sanity by scoffing at ‘sci-fi scenarios’ is no longer something that matters).
I am designing a survey I will send tonight to some university students to test this claim.
Nate Soares excellently describes this process.