Yeah, I wasn’t being totally clear about what I was really thinking in that context. I was thinking: from the point of view of people who have just been devastated by some not-exactly-superintelligent but still pretty smart AI that wasn’t adequately controlled, people who want to make sure that never happens again, what would they assume is the prudent approach to the question of whether there will be more non-aligned AI someday? I figured they would think: “Assume that if there are more, it is inevitable that there will be some non-aligned ones at some point.” The logic being that if we don’t know how to control alignment, there’s no reason to think there won’t someday be significantly non-aligned ones, and we should plan for that contingency.
James_Banks
[Question] What values would EA want to promote?
[Question] What do we do if AI doesn’t take over the world, but still causes a significant global problem?
In any case, both that quoted statement of yours and my tweaked version of it seem very different from the claim “if we don’t currently know how to align/control AIs, it’s inevitable there’ll eventually be significantly non-aligned AIs someday”?
Yes, I agree that there’s a difference.
I wrote up a longer reply to your first comment (the one marked “Answer”), but then I looked up your AI safety doc and realized that I might do better to read through the readings in it first.
I can see a scenario where BCI totalitarianism sounds like a pretty good thing from a hedonic utilitarian point of view:
People are usually more effective workers when they’re happy. So a pragmatic totalitarian government (like Brave New World) rather than a sadistic or sadistic/pragmatic one (1984, maybe) would want its people to be happy all the time, and would stimulate whatever in the brain makes them happy. To suppress dissent it would just delete thoughts and feelings in that direction as painlessly as possible. Competing governments would have an incentive to be pragmatic rather than sadistic.
Then the risk comes from the possibility that humans aren’t worth keeping around as workers, due to automation.
Interesting. A point I could get out of this is: “don’t take your own ideology too seriously, especially when the whole point of your ideology is to make yourself happy.”
An extreme hedonism (a really faithful one) is likely to produce outcomes like:
“I love you.”
“You mean, I give you pleasure?”
“Well, yeah! Duh!”
Which is a funny thing to say, kind of childish or childlike. (Or one could make the exchange be creepy: “Yeah, you mean nothing more to me than the pleasure you give me.”)
Do people really exist to each other?
I see a person X:
1. X has a body. --Okay, on that level they’re real.
2. I can form a mental model of X’s mind. --Good, I consider them a person.
3. X exists for me only in relation to the pleasure or pain they give me. --No, on that level, all that exists to me is my pleasure or pain.
If I’m rigorously hedonistic, then at that deepest level (level 3 above), I am alone with my feelings and points of view. But Bentham maybe doesn’t want me to be rigorously hedonistic anyway.
Thinking back on books that have had a big effect on me, I think they were things which spoke to something already in me, maybe something genetic, to a large extent. It’s like I was programmed from birth to have certain life movements, and so I could immediately recognize what I read as the truth when it came to me—“that’s what I was always wanting to say, but didn’t know how!” I think that probably explains HP:MOR to a large extent (but I haven’t read HP:MOR).
My guess is that a large part of Yudkowsky’s motivation in writing the inspiring texts of the rationalist community was his big huge personality—him expressing himself. It happens that by doing that, he expressed a lot of other people’s personalities. I’m reminded of quotes (which unfortunately I can’t source at the moment) that I remember from David Bowie and John Lennon. David Bowie was accused of being powerful, but he said “I’m not powerful. I’m an observer” (which is actually a really powerful role). John Lennon said something like “Our power was in mainly just talking about our own lives” (vis-à-vis psychedelics, them getting into Eastern thinking, maybe other things) “and that’s a powerful thing.” Maybe Yudkowsky was really just talking about his life: being mad at how the world isn’t an actually good place and how he personally was going to do something about it, and just seeing things that he personally found stupid about how other people thought about things (OK, that’s maybe a strawman of him ;-) ). I think whatever art you do will be potentially more powerful (if you’re lucky enough to get an audience) the deeper it comes from who you are and the more personally you take it.
Here’s a related quote from Eccentrics by David Weeks and Jamie James (pp. 67–68) (I think it’s Weeks speaking in the following quote):
My own concept of creativity is that it is effective, empathic problem-solving. The part that empathy plays in this formulation is that it represents a transaction between the individual and the problem. (I am using the word “problem” loosely, as did Ghiselin: for an artist, the problem might be how to depict an apple.) The creative person displaces his point of view into the problem, investing it with something of his own intellect and personality, and even draws insights from it. He identifies himself with all the depths of the problem. Georges Braque expounded a version of this concept succinctly: “One must not just depict the objects, one must penetrate into them, and one must oneself become the object.”
This total immersion in the problem means that there is a great commitment to understand it at all costs, a deep commitment that recognizes no limits. In some cases the behavior that results can appear extreme by everyday standards. For example, when the brilliant architect Kiyo Izumi was designing a hospital for schizophrenics, he took LSD, which mimics some of the effects of schizophrenia, in order to understand the perceptual distortions of the people who would be living in the building. This phenomenon of total immersion is typical of eccentricity: overboard is the only way most eccentrics know how to go.
This makes me think: “You become the problem, and then at high stakes are forced to solve yourself, because now it’s a life or death situation for you.”
Working to save a life gives it value to you
Similarly, one could be concerned that the rapid economic growth that AIs are expected to bring about could cause a lot of GHG emissions, unless somehow we (or they) figure out how to use clean energy instead.
It looks like some people downvoted you, and my guess is that it may have to do with the title of the post. It’s a strong claim, but also not as informative as it could be; it doesn’t mention anything to do with climate change or GHGs, for instance.
I think there’s a split between 1) “I personally will listen to brutal advice because I’m not going to let my feelings get in the way of things being better” and 2) “I will give brutal advice because other people’s feelings shouldn’t get in the way of things being better”. Maybe Holden wanted people to internalize 1 at the risk of engaging in 2. 2 may have been his way of promoting 1, a way of invalidating the feelings of his readers, who would go on to then be 1 people.
I’m pretty sure that there’s a way to be kind and honest, both in object-level discussion (“your charity is doing X wrong”) and in the meta-discussion of 1. (My possibly uninformed opinion:) Probably there needs to be a meeting in the middle: charities adopting 1 more and more, and funders finding a way to be honest without 2. It takes effort for both to go against what is emotionally satisfying (the thinking nice things about yourself of anti-1, and the lashing out at frustrating immature people of 2). It takes effort to make that kind of change in both funder and charity culture (maybe something to work on for someone who’s appropriately talented?).
Also, this makes me curious: have things changed any since 2007? Does the promotion of 1 still seem as necessary? What role has the letter (or similar ideas/sentiments) played in whatever has happened with charities and funders over the last 13 years?
I like the idea of coming up with some kind of practice to retrain yourself to be more altruistic. There should be some version of that idea that works, and maybe exposing yourself to stories / imagery / etc. about people / animals who can be helped would be part of that.
One possibility is that such images could become naturally compelling for people (and thus would tend to be addictive or obsession-producing, because of their awful compellingness) -- for such people, this practice is probably bad, sometimes (often?) a net bad. But for other people, the images would lose their natural compellingness, and would have to be consumed deliberately.
In our culture we don’t train ourselves to deliberately meditate on things, so it feels “culturally unrealistic”, like something you can’t expect of yourself and the people around you. (Or perhaps some subtle interplay of environmental influences on how we develop as “processors of reality” when we’re growing up is to blame.) I feel like that part of me is more or less irrevocably closed over (maybe not an accurate sentiment, but a strong one). But in other cultures (not so much in the contemporary West), deliberate meditation was / is a thing. For instance people used to (maybe still do) meditate on the death of Jesus to motivate their love of God.
OK, this person on the EA subreddit uses a kind of meditation to reduce irrational/ineffective guilt.
1. I don’t know much about probability and statistics, so forgive me if this sounds completely naive (I’d be interested in reading more on this problem, if it’s as simple for you as saying “go read X”).
Having said that, though, I may have an objection to fanaticism, or something in the neighborhood of it:
Let’s say there is a suite of short-term-payoff, high-certainty bets for making things better.
And also a suite of long-term-payoff, low-certainty bets for making things better. (Things that promise “super-great futures”.)
You could throw a lot of resources at the low certainty bets, and if the certainty is low enough, you could get to the end of time and say “we got nothing for all that”. If the individual bets are low-certainty enough, even if you had a lot of them in your suite you would still have a very high probability of getting nothing for your troubles. (The state of coming up empty-handed.)
That investment could have come at the cost of pursuing the short-term, high certainty suite.
So you might feel regret at the end of time for not having pursued the safer bets, and with that in mind, it might be intuitively rational to pursue safe bets, even with less expected value. You could say “I should pursue high-EV things just because they’re high-EV”, and this “avoid coming up empty-handed” consideration might be a defeater for that.
You can defeat that defeater with “no, actually the likelihood of all these high-EV bets failing is low enough that the high-EV suite is worth pursuing.”
2. It might be equally rational to pursue safety as it is to pursue high EV; it’s just that the safety person and the high-EV person have different values.
3. I think in the real world, people do something like have a mixed portfolio, like Taleb’s advice of “expose yourself to high-risk, high-reward investments/experiences/etc., and also low-risk, low-reward.” And how they do that shows, practically speaking, how much they value super-great futures versus not coming up empty-handed. Do you think your paper, if it got its full audience, would do something like “get some people to shift their resources a little more toward high-risk, high-reward investments”? Or do you think it would have a more radical effect? (A big shift toward high-risk, high-reward? A real bullet-biting, where people do the bare minimum to survive and invest all other resources into pursuing super-high-reward futures?)
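(As an aside, the “coming up empty-handed” worry in point 1 can be put in rough numbers. A minimal sketch, where the success probability p and the number of bets n are made-up illustrative values, not anything from the paper: if each of n independent long-shot bets succeeds with probability p, the chance the whole suite fails is (1 - p)^n.)

```python
# Illustrative sketch: how likely is it that a whole suite of independent
# long-shot bets ALL fail?  p and n below are made-up example values.
def p_empty_handed(p: float, n: int) -> float:
    """Probability that all n independent bets, each with success
    probability p, fail."""
    return (1 - p) ** n

# 20 bets at a 1% success chance each: coming up empty is still
# the most likely outcome (~82%).
print(p_empty_handed(0.01, 20))
# 500 such bets: coming up empty becomes unlikely (under 1%).
print(p_empty_handed(0.01, 500))
```

This is the shape of the defeater-defeater in point 1: whether the high-EV suite is worth pursuing turns on whether the bets are numerous and independent enough to make (1 - p)^n small.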
I can see the appeal in having one ontological world. What is that world, exactly? Is it that which can be proven scientifically (in the sense of, through the scientific method used in natural science)? I think what can be proven scientifically is perhaps what we are most sure is real or true. But things that we are less certain of being real can still exist, as part of the same ontological world. The uncertainty is in us, not in the world. One simplistic definition of natural science is that it is simply rigorous empiricism. The rigor isn’t what metaphysically connects us with things; rather, it’s the empirical that does so, the experiences contacting or occurring to observers. The rigor simply helps us interpret our experiences.
We can have random experiences that don’t add up to anything. But the experiences that give rise to our concept “morality”, which we do seem to be able to discuss with some success with other people, and have done so in different time periods, may be rooted in a natural reality (one which is not part of the deliverances of “natural science” as “natural” is commonly understood, but which is part of “natural science” if by “natural” we mean “part of the one ontological world”). Morality is something we try hard to make a science of (hence the field of ethics), but which to some extent eludes us. But that doesn’t mean that there isn’t something natural there, only that it’s something we have so far not figured out.
Moral realism can be useful in letting us know what kind of things should be considered moral.
For instance, if you ground morality in God, you might ask: Which God? Well, if we know which one, we might know his/her/its preferences, and that inflects our morality. Also, if God partially cashes out to “the foundation of trustworthiness, through love”, then we will approach knowing and obligation themselves (as psychological realities) in a different way (less obsessive? less militant? or, perhaps, less rigorously responsible?).
Sharon Hewitt Rawlette (in The Feeling of Value) grounds her moral realism in “normative qualia”, which for her is something like “the component of pain that feels unacceptable” (or its opposite in pleasure), which leads her to hedonic utilitarianism. Not to preference satisfaction or anything else, but specifically to hedonism.
I think both of the above are best grounded in a “naturalism” (a “one-ontological-world-ism” from my other comment), rather than in anything Enochian or Parfitian.
“King Emeric’s gift has thus played an important role in enabling us to live the monastic life, and it is a fitting sign of gratitude that we have been offering the Holy Sacrifice for him annually for the past 815 years.”
(source: https://sancrucensis.wordpress.com/2019/07/10/king-emeric-of-hungary/ )
It seems to me like longtermists could learn something from people like this. (Maintaining a point of view for 800 years, both keeping the values aligned enough to do this and being around to be able to.)
(Also a short blog post by me occasioned by these monks about “being orthogonal to history” https://formulalessness.blogspot.com/2019/07/orthogonal-to-history.html )
A few things this makes me think of:
explore vs. exploit: For the first part of your life (the first 37%?) you gather information; then for the last part you use that information, maximizing and optimizing according to it. Humans have definite lifespans, but movements don’t. Perhaps a movement’s life depends somewhat on how much exploration it continues to do.
Christianity: I think maybe the only thing all professed Christians have in common is attraction to Jesus, who is vaguely or definitely understood. You could think of Christianity as a movement of submovements (denominations). The results are these nicely homogeneous groups. There’s a Catholic personality or personality-space, a Methodist, Church of Christ, Baptist, etc. Within them are more or less autonomous congregations. Congregations die all the time. Denominations wax and wane. Over time, what used to divide people into denominations (doctrinal differences) has become less relevant (people don’t care about doctrine as much anymore), and new classification criteria connect and divide people along new lines (conservative vs. evangelical vs. mainline vs. progressive). An evangelical Christian family who attend a Baptist church might see only a little problem in switching to a Reformed church that was also evangelical. A Church of Christ member, at a church that would have considered all Baptists to not really be Christians 50 or 100 years ago, listens to some generic non-denominational nominally Baptist preacher who says things he likes to hear, while also hearing the more traditional Church of Christ sermons on Sunday morning.
The application of that example to EA could be something like: Altruism with a capital-A is something like Jesus, a resonant image. Any Altruist ought to be on the same side as any other Altruist, just like any Christian ought to be on the same side as any other Christian, because they share Altruism, or Jesus. Just as there is an ecosystem of Christian movements, submovements, and semiautonomous assemblies, there could be an ecosystem of Altruistic movements, submovements, and semiautonomous groups. It could be encouraged or expected of Altruists that they each be part of multiple Altruistic movements, and thus be exposed to all kinds of outside assumptions, all within some umbrella of Altruism. In this way, within each smaller group, there can be homogeneity. The little groups that exploit can run their course and die while being effective tools in the short- or medium-term, but the overall movement or megamovement does not, because overall it keeps exploring. And, as you point out, continuing to explore improves the effectiveness of altruism. Individual movements can be enriched and corrected by their members’ memberships in other movements.
A Christian who no longer likes being Baptist can find a different Christianity. So it could be the same with Altruists. EAs who “value drift” might do better in a different Altruism, and EA could recruit from people in other Altruisms who felt like moving on from those.
Capital-A Altruism should be defined in a minimalist way in order to include many altruistic people from different perspectives. EAs might think of whatever elements of their altruism that are not EA-specific as a first approximation of Altruism. Once Altruism is defined, it may turn out that there are already a number of existing groups that are basically Altruistic, though having different cultures and different perspectives than EA.
Little-a altruism might be too broad for compatibility with EA. I would think that groups involved in politicizing go against EA’s ways. But then, maybe having connection even with them is good for Altruists.
In parallel to Christianity, when Altruism is at least somewhat defined, then people will want to take the name of it, and might not even be really compliant with the N Points of Altruism, whatever value of N one could come up with—this can be a good and a bad thing, better for diversity, worse for brand strength. But also in parallel to Christianity, there is generally a similarity within professed Christians which is at least a little bit meaningful. Experienced Christians have some idea of how to sort each other out, and so it could be with Altruists. Effective Altruism can continue to be as rigorously defined as it might want to be, allowing other Altruisms to be different.
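(On the “first 37%” figure under explore vs. exploit above: it alludes to the classic secretary, or optimal-stopping, problem: observe the first n/e of the candidates without committing, then take the first one better than everything seen so far. A quick simulation sketch, with arbitrary illustrative parameters:)

```python
import math
import random

def secretary_trial(n: int, cutoff: int) -> bool:
    """One run of the secretary problem's classic strategy: skip the first
    `cutoff` candidates (cutoff >= 1 assumed), then take the first later
    candidate better than all of them.  Success = picking the best overall."""
    ranks = list(range(n))        # rank 0 is the best candidate
    random.shuffle(ranks)
    best_seen = min(ranks[:cutoff])
    for r in ranks[cutoff:]:
        if r < best_seen:
            return r == 0         # committed; success iff it was the best
    return ranks[-1] == 0         # never committed; stuck with the last one

def success_rate(n: int = 100, trials: int = 20000) -> float:
    cutoff = round(n / math.e)    # observe ~37% of candidates first
    return sum(secretary_trial(n, cutoff) for _ in range(trials)) / trials

# The success rate hovers near 1/e (about 0.37), the best any
# stopping rule can do in this setup.
```

The rule assumes a fixed, known horizon, which is exactly what the point above questions: movements, unlike individual lives, need not have one, so continued exploration can stay rational for them.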