One thing the AI Pause Debate Week has made salient to me: there appears to be a mismatch between the kind of slowing that on-the-ground AI policy folks talk about and the kind that AI policy researchers and technical alignment people talk about.
My impression from talking to policy folks who are in or close to government—admittedly a sample of only five or so—is that the main[1] coordination problem for reducing AI x-risk is ensuring that the so-called alignment tax gets paid (i.e., ensuring that all the big labs put some time/money/effort into safety, and that none “defect” by skimping on safety to jump ahead on capabilities). This seems to rest on two assumptions: that the alignment tax is a coherent notion, and that technical alignment people are somewhat on track to work out how to pay it.
On the other hand, my impression is that technical alignment people, and AI policy researchers at EA-oriented orgs,[2] are not at all confident in there being a viable level of time/money/effort that will produce safe AGI on the default trajectory. The type of policy action that’s needed, so they seem to say, is much more drastic. For example, something in the vein of global coordination to slow, limit, or outright stop development and deployment of AI capabilities (see, e.g., Larsen’s,[3] Bensinger’s, and Stein-Perlman’s debate week posts), whilst alignment researchers scramble to figure out how on earth to align frontier systems.
I’m concerned by this mismatch. It would appear that the game plans of two adjacent clusters of people working to reduce AI x-risk are at odds. (Clearly, this is an oversimplification and there is a range of takes within both clusters, but my current epistemic status is that this oversimplification gestures at a true and important pattern.)
Am I simply mistaken about there being a mismatch here? If not, is anyone working to remedy it? And does anyone have thoughts on how this mismatch arose, or how to prevent similar ones from arising in the future? (According to the debate week announcement, Scott Alexander will be writing a summary/conclusion post.)
1. ^ In the USA, this main is served with a hearty side order of “Let’s make sure China in particular never races ahead on capabilities.”
2. ^ E.g., Rethink Priorities, AI Impacts.
3. ^ I’m aware that Larsen recently crossed over into writing policy bills, but I’m counting them as a technical person on account of their technical background and their time spent in the Berkeley sphere of technical alignment people. Nonetheless, perhaps crossovers like this are a good omen for policy and technical people getting onto the same page.