For what it’s worth, I endorse @Habryka’s old comment on this issue:
Man, this sure is a dicey topic, but I do think it’s pretty likely that Torres has a personality disorder, and that modeling these kinds of things is often important.
A while ago we had a conversation on the forum on whether Elon Musk might be (at least somewhat) autistic. A number of people pushed back on this as ungrounded speculation and as irrelevant in a way that seemed highly confused to me, since like, being autistic has huge effects on how you make decisions and how you relate to the world, and Musk has been a relevant player in many EA-adjacent cause areas for quite a while.
I do think there is some trickiness in talking about this kind of stuff, but talking about someone’s internal mental makeup can often be really important. Indeed, lots of people were saying to me in person that they were modeling SBF as a sociopath, and implying that they would not feel comfortable giving that description in public, since that’s rude. I think in this case that diagnosis sure would have been really helpful, and I think our norms against bringing up this kind of stuff harmed us quite a bit.
To be clear, I am not advocating for a culture of psychologizing everyone. I think that’s terrible, and a lot of the worst interactions I’ve had with people external to the community have been with people who have tried to dismiss various risks from artificial intelligence through various psychologizing lenses like “these people are power-obsessed, which is why they think an AI will want to dominate everyone”, which… are really not helpful and seem just straightforwardly very wrong to me, while also being very hard to respond to.
I don’t currently have a great proposal for norms for discussing this kind of stuff, especially as an attack (I feel less bad about the Elon autism discussion, since like, Elon identifies at least partially as autistic and I don’t think he would see it as an insult). Seems hard. My current guess is that it must be OK, at some point, to bring up more psychologizing explanations and intuitions after engaging extensively with someone’s object-level arguments, but that this should currently come pretty late, after the object-level has been responded to and relatively thoroughly explored. I think this is the case with Torres, but not the case with many other people.
Thank you for your work there. I’m curious about what made you resign, and also about why you’ve chosen to communicate that now?
(I expect that you are under some form of NDA, and that if you were willing and able to talk about why you resigned then you would have done so in your initial post. Therefore, for readers interested in some possibly related news: last month, Daniel Kokotajlo quit OpenAI’s Futures/Governance team “due to losing confidence that it [OpenAI] would behave responsibly around the time of AGI,” and a Superalignment researcher was forced out of OpenAI in what may have been a political firing (source). OpenAI appears to be losing its most safety-conscious people.)