I’m Callum. I founded the ARENA program for technical AI safety upskilling, and I’ve also worked on various open-source mech interp projects. I was on Anthropic’s interp team for ~6 months (working on some of their recent papers), and I’m currently doing interp at DeepMind. I like mathematical art, running, and climbing, and most importantly I hold a Guinness World Record for the largest number of clothes pegs held in one hand at once (yep, really).
Callum McDougall
Great! (-:
Join ASAP (AI Safety Accountability Programme)
Thanks for the comment! I first want to register strong agreement with many of your points, e.g. that the root of the problem isn’t necessarily technology itself, but rather our inability to do things like coordinate well and think in a long-term way. I also think that focusing too much on individual risks while neglecting the larger picture is a failure mode that some in the community fall into, and Ord’s book might have done well to spend some time taking this perspective (he does talk about risk factors, which is part of the way to a more systemic perspective, but he doesn’t really address the fundamental drivers of many of these risks, which I agree seems like a missed opportunity).
That being said, I think I have a few main disagreements here:
Lack of good opportunities for more general longtermist interventions. I think if there were really promising avenues for advancing along the frontiers you suggest (e.g. trying to encourage cultural and philosophical perspective shifts, if I’m understanding your point here correctly) then I’d probably change my mind here. But it still seems to me that these kinds of interventions aren’t as promising as direct work on individual risks, which is still super neglected in cases like bio/AI.
Work on individual risks does (at least partially) generalise. For instance, in the case of work on specific future risks (e.g. bio and AI), it doesn’t seem like we can draw useful lessons about what kinds of strategies work (e.g. regulation/slowing research, better public materials and education about the risks, integrating more with the academic community) unless we actually try out these strategies.
Addressing some risks might directly reduce others. For instance, getting AI alignment right would probably be a massive boon for our ability to handle other natural risks. This is pretty speculative though, because we don’t really know what a future where we get AI right looks like.
Yeah, +1 to Nandini’s point. I think we should have made this clearer in the post. I think people have a lot of misconceptions about EA (e.g. lots of people just think EA is about effective charitable giving), and we wanted to emphasise this particular part rather than trying to construct the whole tower of assumptions.
That being said, I do think we could have done a better job of toning down the abundance of writing from Ord/Bostrom and including different perspectives. If you have any specific recommendations for reading material you think would positively contribute in any week (or reading material already in the course that you think could be removed), we’d be really grateful!
That’s something we’ve definitely considered, but the idea is for this course to be marketed mainly via CERI, and since they already have existential risks in their name plus define it in a lot of their promo material, we felt like it would probably be more appropriate to stick with that terminology.
Introducing the Existential Risks Introductory Course (ERIC)
Announcing the Distillation for Alignment Practicum (DAP)
“Supercluster”, “Triangulum”, or “Centaurus”?
Keeping with the astronomical theme of “Lightcone” and “Constellation”, while also sounding like they could be names for gatherings of people.
That’s a good point! Although I guess one reply you could have to this is that we shouldn’t expect paradigm shifts to slow down, and indeed I think most of Yudkowsky’s probability mass is on something like “there is a paradigm shift in AI which rapidly unlocks the capabilities for general intelligence”, rather than e.g. continuous scaling from current systems.
Thanks, I really appreciate your comment!
And yep, I agree Yudkowsky doesn’t seem to be saying this, because it doesn’t really represent a phase change driven by positive feedback cycles of intelligence, which is what he expects to happen in a hard takeoff.
I think more of the actual mathematical models he uses when discussing takeoff speeds can be found in his Intelligence Explosion Microeconomics paper. I haven’t read it in detail, but my general impression of the paper (and of how it’s seen by others in the field) is that it successfully manages to make strong statements about the nature of intelligence and what it implies for takeoff speeds without relying on reference classes, but that it’s (a) not particularly accessible, and (b) not very in touch with the modern deep learning paradigm (largely because of an over-reliance on the concept of recursive self-improvement, which now doesn’t seem like it will pan out the way it was originally expected to).
Ah yep, I’d been planning to do that but had forgotten, will do now. Thanks!
MIRI Conversations: Technology Forecasting & Gradualism (Distillation)
Update on the project board thing—I’m assuming that was referring to this website, which looks really awesome!
Skilling-up in ML Engineering for Alignment: request for comments
Hey Pablo! These seem really interesting, I love the implementation of music with Anki (even if they haven’t all been successes). Adding audio files was one thing I forgot to mention in the post, I did it when I was learning Chinese and it was pretty useful (although I stopped learning Chinese pretty soon after I started adding them, so they never really caught on—I suspect it would have been too much work to mass-produce).
I like the keyboard shortcuts one! It would be great if Anki had a way of testing whether you’d typed out the right keys, although I agree that without this it seems hard to implement (I’ve sketched one possible hack at the end of this comment).
And thanks for describing how you create Anki cards for textbooks—it’s always interesting to hear other people’s actual card creation process, i.e. how you decide which bits of knowledge to Ankify in whatever you’re reading, rather than just the mechanics (the former is something I didn’t really go into in the post).
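For what it’s worth, here’s a very rough sketch of the kind of hack I had in mind: a bit of script embedded in the card template that listens for keypresses and compares them against a “Shortcut” field on the note. Both the field name and the “feedback” element are hypothetical, and I haven’t tested any of this (Anki’s reviewer grabs some keys before they ever reach the card, so it may well break in practice). Written as TypeScript for readability, though it would need to be plain JS inside the template:

```typescript
// Hypothetical sketch: compare the keys the user presses against the
// shortcut stored in a (hypothetical) "Shortcut" note field, e.g. "ctrl+shift+k".
// {{Shortcut}} is Anki's normal field-substitution syntax.
const expected = "{{Shortcut}}".toLowerCase();

function describe(e: KeyboardEvent): string {
  const parts: string[] = [];
  if (e.ctrlKey) parts.push("ctrl");
  if (e.altKey) parts.push("alt");
  if (e.shiftKey) parts.push("shift");
  parts.push(e.key.toLowerCase());
  return parts.join("+");
}

document.addEventListener("keydown", (e) => {
  // Ignore presses of the modifier keys themselves
  if (["control", "alt", "shift", "meta"].includes(e.key.toLowerCase())) return;
  const feedback = document.getElementById("feedback"); // a <div id="feedback"> in the template
  if (feedback) {
    const pressed = describe(e);
    feedback.textContent = pressed === expected ? "correct!" : "you pressed " + pressed;
  }
});
```

A more robust version would probably have to be an add-on rather than template JS, since the desktop app handles a lot of its own shortcuts (space, 1–4, etc.) before the webview sees them, but for card-specific shortcuts this might be enough to self-test.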
Ah thanks, that looks awesome! Will definitely suggest this in the group.