I am Issa Rice. https://issarice.com/
I agree with most of the points in this post (AI timelines might be quite short; the probability of doom given AGI in a world that looks like our current one is high; there isn’t much hope for good outcomes for humanity unless AI progress is slowed down somehow). I will focus on one of the parts where I think I disagree, which feels like a crux for me on whether advocating an AI pause (in its current form) is a good idea.
You write:
But we can still have all the nice things (including a cure for ageing) without AGI; it might just take a bit longer than hoped. We don’t need to be risking life and limb driving through red lights just to be getting to our dream holiday a few minutes earlier.
I think framings like these do a misleading thing where they use the word “we” to ambiguously refer to both “humanity as a whole” and “us humans who are currently alive”. The “we” that decides how much risk to take is the humans currently alive, but the “we” that enjoys the dream holiday might be humans millions of years in the future.
I worry that “AI pause” is not being marketed honestly to the public. If people like Wei Dai are right (and I currently think they are), then AI development may need to be paused for potentially millions of years, and it’s unclear how long it will take unaugmented or only mildly augmented humans to reach longevity escape velocity.
So to a first approximation, the choice available to humans currently alive is something like:
Option A: 10% chance of utopia within our lifetime (if alignment turns out to be easy) and 90% chance of human extinction
Option B: ~100% chance of death, but then our descendants probably get to live in a utopia
For philosophy nerds with low time preference and altruistic tendencies (a group in which I would include many EA people and also myself), Option B may seem obvious. But I think many humans existing today would rather risk it and just try to build AGI now rather than doing any AI pause, and to the extent that they say they prefer a pause, I think they are being deceived by the marketing or are acting under the Caplanian Principle of Normality, or else they are somehow better philosophers than I expected them to be.
(Note: if you are so pessimistic about aligning AI without a pause that your probability on that is lower than the probability of unaugmented present-day humans reaching longevity escape velocity, then Option B does seem like a strictly better choice. But the older and more unhealthy you are, the less this applies to you personally.)
I’ve wondered about this for independent projects and there’s some previous discussion here.
See also the “shadows of the future” term that Michael Nielsen uses.
I think a general and theoretically sound approach would be to build a single composite game to represent all of the games together.
Yeah, I did actually have this thought, but I guess I turned it around and thought: shouldn’t an adequate notion of value be invariant to how I decide to split up my games? The linearity property on Wikipedia even seems to be inviting us to split games up however we want.
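For reference, the linearity (additivity) property I’m referring to is the standard axiom that for any two games $v$ and $w$ on the same player set,

$$\varphi_i(v + w) = \varphi_i(v) + \varphi_i(w) \quad \text{for every player } i,$$

where $(v + w)(S) = v(S) + w(S)$: if you model one big game as a sum of smaller games, each player’s Shapley values just add up across the pieces.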
And yeah, I agree that in the real world games will overlap and so there will be double counting going on by splitting games up. But if that’s all that’s saving us from reaching absurd conclusions then I feel like there ought to be some refinement of the Shapley value concept...
I asked my question because the problem with infinities seems unique to Shapley values (e.g. I don’t have this same confusion about the concept of “marginal value added”). Even with a small population, the number of cooperative games seems infinite: for example, there are an infinite number of mathematical theorems that could be proven, an infinite number of Wikipedia articles that could be written, an infinite number of films that could be made, etc. If we just use “marginal value added”, the total value any single person adds is finite across all such cooperative games because in the actual world, they can only do finitely many things. But the Shapley value doesn’t look at just the “actual world”, it seems to look at all possible sequences of ways of adding people to the grand coalition and then averages the value, so people get non-zero Shapley value assigned to them even if they didn’t do anything in the “actual world”.
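For concreteness, the permutation form of the definition I’m going by is

$$\varphi_i(v) = \frac{1}{|N|!} \sum_{\sigma} \Bigl[ v\bigl(P_i^{\sigma} \cup \{i\}\bigr) - v\bigl(P_i^{\sigma}\bigr) \Bigr],$$

where the sum ranges over all orderings $\sigma$ of the player set $N$ and $P_i^{\sigma}$ is the set of players that precede $i$ in $\sigma$. Nothing in this formula refers to what actually happened; it only looks at what each coalition could achieve.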
(There’s maybe some sort of “compactness” argument one could make that even if there are infinitely many games, in the real world only finitely many of them get played to completion and so this should restrict the total Shapley value any single person can get, but I’m just trying to go by the official definition for now.)
I don’t think the example you give addresses my point. I am supposing that Leibniz could have also invented calculus, so he has a valid claim to (some of) the Shapley value of inventing calculus. But Leibniz could have also invented lots of different things (infinitely many things!), and his claim to each invention would be valid (although in the real world he only invents finitely many things). If each invention is worth at least a unit of value, his Shapley value across all inventions would be infinite, even if Leibniz was “maximally unlucky” and in the actual world got scooped every single time and so did not invent anything at all.
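Here is a minimal brute-force sketch (a toy model of my own, not something from the post) of the kind of calculation I mean; the hypothetical `invention` game is worth one unit and either inventor alone suffices to realize it:

```python
from itertools import permutations

def shapley_values(players, v):
    """Brute-force Shapley values: average each player's marginal
    contribution over every ordering in which the grand coalition
    could be assembled."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = v(coalition)
            coalition.add(p)
            totals[p] += v(coalition) - before
    return {p: totals[p] / len(orderings) for p in players}

# Toy "invention" game worth 1 unit: either inventor alone suffices.
def invention(coalition):
    return 1.0 if coalition else 0.0

print(shapley_values(["Newton", "Leibniz"], invention))
# -> {'Newton': 0.5, 'Leibniz': 0.5}
# Leibniz gets 0.5 even if, in the actual world, Newton published
# first and Leibniz "did nothing" about calculus.
```

If you now imagine infinitely many such unit-value games, each assigning Leibniz a value of 0.5, his total across games diverges; that is the absurdity I’m gesturing at.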
I don’t understand the part about self-modifications—can you spell it out in more words/maybe give an example?
Disagree-voting a question seems super aggressive and also nonsensical to me. (Yes, my comment did include some statements as well, but they were all scaffolding to present my confusion. I wasn’t presenting my question as an opinion, as my final sentence makes clear.) I’ve been unhappy with the way the EA Forum has been going for a long time now, but I am noting this as a new kind of low.
What numerator and denominator? I am imagining that a single person could be a player in multiple cooperative games. The Shapley value for the person would be finite in each game, but if there are infinitely many games, the sum of all the Shapley values (adding across all games, not adding across all players in a single game) could be infinite.
Example 7 seems wild to me. If the applicants who don’t get the job also get some of the value, does that mean people are constantly collecting Shapley value from the world, just because they “could” have done a thing (even if they do absolutely nothing)? If there are an infinite number of cooperative games going on in the world and someone can plausibly contribute at least a unit of value to any one of them, then it seems like their total Shapley value across all games is infinite, and at that point it seems like they are as good as one can be, all without having done anything. I can’t tell if I’m making some sort of error here or if this is just how the Shapley value works.
Do you know of any ways I could experimentally expose myself to extreme amounts of pleasure, happiness, tranquility, and truth?
I’m not aware of any way to expose yourself to extreme amounts of pleasure, happiness, tranquility, and truth that is cheap, legal, time efficient, and safe. That’s part of the point I was trying to make in my original comment. If you’re willing to forgo some of those requirements, then as Ian/Michael mentioned, for pleasure and tranquility I think certain psychedelics (possibly illegal depending on where you live, possibly unsafe, and depending on your disposition/luck may be a terrible idea) and meditation practices (possibly expensive, takes a long time, possibly unsafe) could be places to look into. For truth, maybe something like “learning all the fields and talking to all the people out there” (expensive, time-consuming, and probably unsafe/distressing), though I realize that’s a pretty unhelpful suggestion.
I’d be willing to expose myself to whatever you suggest, plus extreme suffering, to see if this changes my mind. Or we can work together to design a different experimental setup if you think that would produce better evidence.
I appreciate the offer, and think it’s brave/sincere/earnest of you (not trying to be snarky/dismissive/ironic here—I really wish more people had more of this trait that you seem to possess). My current thinking though is that humans need quite a benign environment in order to stay sane and be able to introspect well on their values (see discussion here, where I basically agree with Wei Dai), and that extreme experiences in general tend to make people “insane” in unpredictable ways. (See here for a similar concern I once voiced around psychedelics.) And even a bunch of seemingly non-extreme experiences (like reading the news, going on social media, or being exposed to various social environments like cults and Cultural Revolution-type dynamics) seem to have historically made a bunch of people insane and continue to make people insane. Basically, although flawed, I think we still have a bunch of humans around who are still basically sane or at least have some “grain of sanity” in them, and I think it’s incredibly important to preserve that sanity. So I would probably actively discourage people from undertaking such experiments in most cases.
It may end up being that such intensely positive values are possible in principle and matter as much as intense pains, but they don’t matter in practice for neartermists, because they’re too rare and difficult to induce. Your theory could symmetrically prioritize both extremes in principle, but end up suffering-focused in practice. I think the case for upside focus in longtermism could be stronger, though.
If by “neartermism” you mean something like “how do we best help humans/animals/etc who currently exist using only technologies that currently exist, while completely ignoring the fact that AGI may be created within the next couple of decades” or “how do we make the next 1 year of experiences as good as we can while ignoring anything beyond that” or something along those lines, then I agree. But I guess I wasn’t really thinking along those lines since I find that kind of neartermism either pretty implausible or feel like it doesn’t really include all the relevant time periods I care about.
It’s also conceivable that pleasurable states as intense as excruciating pains in particular are not possible in principle after refining our definitions of pleasure and suffering and their intensities.
I agree with you that that is definitely conceivable. But I think that, as Carl argued in his post (and elaborated on further in the comment thread with gwern), our default assumption should be that efficiency (and probably also intensity) of pleasure vs pain is symmetric.
I am worried that exposing oneself to extreme amounts of suffering without also exposing oneself to extreme amounts of pleasure, happiness, tranquility, truth, etc., will predictably lead one to care a lot more about reducing suffering compared to doing something about other common human values, which seems to have happened here. And the fact that certain experiences like pain are a lot easier to induce (at extreme intensities) than other experiences creates a bias in which values people care the most about.
Carl Shulman made a similar point in this post: “This is important to remember since our intuitions and experience may mislead us about the intensity of pain and pleasure which are possible. In humans, the pleasure of orgasm may be less than the pain of deadly injury, since death is a much larger loss of reproductive success than a single sex act is a gain. But there is nothing problematic about the idea of much more intense pleasures, such that their combination with great pains would be satisfying on balance.”
Personally speaking, as someone who has been depressed and anxious for most of my life and has sometimes (unintentionally) experienced extreme amounts of suffering, I don’t currently find myself caring more about pleasure/happiness compared to pain/suffering (I would say I care about them roughly the same). There’s also this thing I’ve noticed where sometimes when I’m suffering a lot, the suffering starts to “feel good” and I don’t mind it as much, and symmetrically, when I’ve been happy the happiness has started to “feel fake” somehow, so overall I feel pretty confused about what terminal values I am even optimizing for (but thankfully it seems like, given the current strategic landscape, I don’t need to figure this out immediately).
Has Holden written any updates on outcomes associated with the grant?
Not to my knowledge.
I don’t think that lobbying against OpenAI, or other adversarial action, would have been that hard.
It seems like once OpenAI was created and had disrupted the “nascent spirit of cooperation”, even if OpenAI went away (like, the company and all its employees magically disappeared), the culture/people’s orientation to AI stuff (“which monkey gets the poison banana” etc.) wouldn’t have been reversible. So I don’t know if there was anything Open Phil could have done to OpenAI in 2017 to meaningfully change the situation in 2022 (other than like, slowing AI timelines by a bit). Or maybe you mean some more complicated plan like ‘adversarial action against OpenAI and any other AI labs that spring up later, and try to bring back the old spirit of cooperation, and get all the top people into DeepMind instead of spreading out among different labs’.
Eliezer’s tweet is about the founding of OpenAI, whereas Agrippa’s comment is about a 2017 grant to OpenAI (OpenAI was founded in 2015, so this was not a founding grant). It seems like to argue that Open Phil’s grant was net negative (and so strongly net negative as to swamp other EA movement efforts), one would have to compare OpenAI’s work in a counterfactual world where it never got the extra $30 million in 2017 (and Holden never joined the board) with the actual world in which those things happened. That seems a lot harder to argue for than what Eliezer is claiming (Eliezer only has to compare a world where OpenAI didn’t exist vs the actual world where it does exist).
Personally, I agree with Eliezer that the founding of OpenAI was a terrible idea, but I am pretty uncertain about whether Open Phil’s grant was a good or bad idea. Given that OpenAI had already disrupted the “nascent spirit of cooperation” that Eliezer mentions and was going to do things, it seems plausible that buying a board seat for someone with quite a bit of understanding of AI risk is a good idea (though I can also see many reasons it could be a bad idea).
One can also argue that EA memes re AI risk led to the creation of OpenAI, and that therefore EA is net negative (see here for details). But if this is the argument Agrippa wants to make, then I am confused why they decided to link to the 2017 grant.
What textbooks would you recommend for these topics? (Right now my list is only “Linear Algebra Done Right”)
I would recommend not starting with Linear Algebra Done Right unless you already know the basics of linear algebra. The book does not cover some basic material (like row reduction, elementary matrices, solving linear equations) and instead focuses on trying to build up the theory of linear algebra in a “clean” way, which makes it enlightening as a second or third exposure to linear algebra but a cruel way to be introduced to the subject for the first time. I think 3Blue1Brown videos → Vipul Naik’s lecture notes → 3Blue1Brown videos (again) → Gilbert Strang-like books/Treil’s Linear Algebra Done Wrong → 3Blue1Brown videos (yet again) → Linear Algebra Done Right would provide a much smoother experience. (See also this comment that I wrote a while ago.)
Many domains that people tend to conceptualize as “skill mastery, not cult indoctrination” also have some cult-like properties like having a charismatic teacher, not being able to question authority (or at least, not being encouraged to think for oneself), and a social environment where it seems like other students unquestioningly accept the teachings. I’ve personally experienced some of this stuff in martial arts practice, math culture, and music lessons, though I wouldn’t call any of those a cult.
Two points this comparison brings up for me:
1. EA seems unusually good compared to these “skill mastery” domains in repeatedly telling people “yes, you should think for yourself and come to your own conclusions”, even at the introductory levels, and also in just generally being open to discussions like “is EA a cult?”.
2. I’m worried this post will be condensed in people’s minds into something like “just conceptualize EA as a skill instead of this cult-like thing”. But if even skill-like things have cult-like elements, maybe that condensed version won’t help people make EA less cult-like. Or maybe it’s actually okay for EA to have some cult-like elements!
He was at UW in person (he was a grad student at UW before he switched his PhD to AI safety and moved back to Berkeley).
Setting expectations without making it exclusive seems good.
“Seminar program” or “seminar” or “reading group” or “intensive reading group” sound like good names to me.
I’m guessing there is a way to run such a group in a way that both you and I would be happy about.
The actual activities that the people in a fellowship engage in, like reading things and discussing them and socializing and doing giving games and so forth, don’t seem different from what a typical reading club or meetup group does. I am fine with all of these activities, and think they can be quite valuable.
So how are EA introductory fellowships different from a bare reading club or meetup group? My understanding is that the main differences are exclusivity and the branding. I’m not a fan of exclusivity in general, but especially dislike it when there doesn’t seem to be a good reason for it (e.g. why not just split the discussion into separate circles if there are too many people?) or where self-selection would have worked (e.g. making the content of the fellowship more difficult so that the less interested people will leave on their own). As for branding, I couldn’t find a reason why these groups are branded as “fellowships” in any of the pages or blog posts I looked at. But my guess is that it is a way to manufacture prestige for both the organizers/movement and for the participants. This kind of prestige-seeking seems pretty bad to me. (I can elaborate more on either point if you want to understand my reasoning.)
I haven’t spent too much time looking into these fellowships, so it’s quite possible I am misunderstanding something, and would be happy to be corrected.
I didn’t. As far as I know, introductory fellowships weren’t even a thing in EA back in 2014 (or if they were, I don’t remember hearing about them back then despite reading a bunch of EA things on the internet). However, I have a pretty negative opinion of these fellowships so I don’t think I would have wanted to start one even if they were around at the time.
I was indeed simplifying, and e.g. probably should have said “global catastrophe” instead of “human extinction” to cover cases like permanent totalitarian regimes. I think some of the scenarios you mention could happen, but also think a bunch of them are pretty unlikely, and also disagree with your conclusion that “The bulk of the probability lies somewhere in the middle”. I might be up for discussing more specifics, but also I don’t get the sense that disagreement here is a crux for either of us, so I’m also not sure how much value there would be in continuing down this thread.