AI safety governance/strategy research & field-building.
Formerly a PhD student in clinical psychology @ UPenn, college student at Harvard, and summer research fellow at the Happier Lives Institute.
Very excited to see this section (from the condensed report). Are you able to say more about the kind of work you would find useful in this space or the organizations/individuals that you think are doing some exemplary work in this space?
We recommend interventions which plan for worst-case scenarios—that is, interventions which are effective when preventative measures fail to prevent AI threats emerging. For concreteness, we outline some potential interventions which boost resilience against AI risks.
Developing contingency plans: Ensure there are clear plans and protocols in the event that an AI system poses an unacceptably high level of risk.13 Such planning could be analogous to planning in other fields, such as pandemic preparedness or nuclear wargaming.
Robust shutdown mechanisms: Invest in infrastructure and planning to make it easier to close down AI systems in scenarios where they pose unacceptably high levels of risk.
Also, very minor, but I think there’s a formatting issue with footnote 23.
To me, it sounds like you’re saying, ‘Bob is developing a more healthy relationship with EA’.
Oh just a quick clarification– I wasn’t trying to say anything about Bob or Bob’s relationship with EA here.
I just wanted to chime in with my own experience (which is not the same as Bob’s but shares one similarity in that they’re both in the “rethinking one’s relationship with the EA community/movement” umbrella).
More generally, I suspect many forum readers are grappling with this question of “what do I want my relationship with the EA community/movement to be”. Given this, it might be useful for more people to share how they’ve processed these questions (whether they’re related to the recent Manifold events or related to other things that have caused people to question their affiliation with EA).
Thanks for sharing your experience here. I’m glad you see a path forward that involves continuing to work on issues you care about despite distancing yourself from the community.
In general, I think people should be more willing to recognize that you can accept EA ideas or pursue EA-inspired careers without necessarily accepting the EA community. I sometimes hear people struggling with the fact that they like a lot of the values/beliefs in EA (e.g., desire to use evidence and reason to find cost-effective and time-effective ways of improving the world) while having a lot of concerns about the modern EA movement/community.
The main thing I tell these folks is that you can live by certain EA principles while distancing yourself from the community. I’ve known several people who have distanced themselves from the community (for various reasons, not just the ones listed here) but remained in AI safety or other topics they care about.
Personally, I feel like I’ve benefitted quite a bit from being less centrally involved in the EA space (and correspondingly being more involved in other professional/social spaces). I think this comment by Habryka describes a lot of the psychological/intellectual effects that I experienced.
Relatedly, as I specialized more in AI safety, I found it useful to ask questions like “what spaces should I go to where I can meet people who could help with my AI safety goals”. This sometimes overlapped with “go to EA event” but often overlapped with “go meet people outside the EA community who are doing relevant work or have relevant experience”, and I think this has been a very valuable part of my professional growth over the last 1-2 years.
oops yup— was conflating and my comment makes less sense once the conflation goes away. good catch!
To clarify, I did see the invitations to other funders. However, my perception was that those are invitations to find people to hand things off to, rather than to be a continuing partner like with GV.
This was also my impression. To the extent that the reason why OP doesn’t want to fund something is because of PR risks & energy/time/attention costs, it’s a bit surprising that OP would partner with another group to fund something.
Perhaps the idea here is that the PR/energy/time/attention costs would be split between orgs? And that this would outweigh the costs of coordinating with another group?
Or it’s just that OP feels better if OP doesn’t have to spend its own money on something? Perhaps because of funding constraints?
I’m also a bit confused about scenarios where OP wouldn’t fund X for PR reasons but would want some other EA group to fund X. It seems to me like the PR attacks against the EA movement would be just as strong– perhaps OP as an institution could distance itself, but from an altruistic standpoint that wouldn’t matter much. (I do see how OP would want to not fund something for energy/capacity reasons but then be OK with some other funder focusing on that space.)
In general, I feel like communication from OP could have been clearer in a lot of the comments. Or OP could’ve done a “meta thing” just making it explicit that they don’t currently want to share more details.
In the post above we (twice!) invited outreach from other funders
But phrasing like this, for example, makes me wonder if OP believes it’s communicating clearly and is genuinely baffled when commenters have (what I see as quite reasonable) misunderstandings or confusions.
I read the part after “or” as extending the frame beyond reputation risks, and I was pleased to see that and chose to engage with it.
Ah, gotcha. This makes sense– thanks for the clarification.
If you look at my comments here and in my post, I’ve elaborated on other issues quite a few times and people keep ignoring those comments and projecting “PR risk” on to everything
I’ve looked over the comments here a few times, and I suspect you might think you’re coming off more clearly than you actually are. It’s plausible to me that since you have all the context of your decision-making, you don’t see when you’re saying things that would genuinely confuse others.
For example, even in the statement you affirmed, I see how, if one is paying attention to the “or”, one could see you as technically only/primarily endorsing the non-PR part of the phrase.
But in general, I think it’s pretty reasonable and expected that people ended up focusing on the PR part.
More broadly, I think some of your statements have been kind of short and open to many interpretations. E.g., I don’t get a clear sense of what you mean by this:
It’s not just “lower risk” but more shared responsibility and energy to engage with decision making, persuading, defending, etc.
I think it’s reasonable for you to stop engaging here. Communication is hard and costly, misinterpretations are common and drain energy, etc. Just noting that– from my POV– this is less of a case of “people were interpreting you uncharitably” and more of a case of “it was/is genuinely kind of hard to tell what you believe, and I suspect people are mostly engaging in good faith here.”
The attitude in EA communities is “give an inch, fight a mile”. So I’ll choose to be less legible instead.
As a datapoint (which you can completely ignore), I feel like in the circles I travel in, I’ve heard a lot more criticisms of OP that look more like “shady non-transparent group that makes huge decisions/mistakes without consulting anyone except a few Trusted People who all share the same opinions.”
There are certainly some cases in which the attack surface is increased when you’re fully open/transparent about reasoning.
But I do think it can be easy to underestimate the amount of reputational damage that OP (and you, by extension) take from being less legible/transparent. I think there’s a serious risk that many subgroups in EA will continue to feel more critical of OP as it becomes more clear that OP is not interested in explaining its reasoning to the broader community, becomes more insular, etc. I also suspect this will have a meaningful effect on how OP is perceived in non-EA circles. I don’t mean e/accs being like “OP are evil doomers who want to give our future to China”– I mean neutral third parties who dispassionately try to form an impression of OP. When they encounter arguments like “well OP is just another shady billionaire-funded thing that is beholden to a very small group of people who end up deciding things in non-transparent and illegible ways, and those decisions sometimes produce pretty large-scale failures”, I expect that they will find these concerns pretty credible.
Caveating that not all of these concerns would go away with more transparency and that I do generally buy that more transparency will (in some cases) lead to a net increase in the attack surface. The tradeoffs here seem quite difficult.
But my own opinion is that OP has shifted too far in the “worry a lot about PR in the conventional sense” direction in ways that have not only led to less funding for important projects but also led to a corresponding reduction in reputation/status/prestige, both within and outside of EA circles.
@Dustin Moskovitz I think some of the confusion is resulting from this:
Your second statement is basically right, though my personal view is they impose costs on the movement/EA brand and not just us personally. Digital minds work, for example, primes the idea that our AI safety concerns are focused on consciousness-driven catalysts (“Terminator scenarios”), when in reality that is just one of a wide variety of ways AI can result in catastrophe.
In my reading of the thread, you first said “yeah, basically I think a lot of these funding changes are based on reputational risk to me and to the broader EA movement.”
Then, people started challenging things like “how much should reputational risk to the EA movement matter and what really are the second-order effects of things like digital minds research.”
Then, I was expecting you to just say something like “yeah, we probably disagree on the importance of reputation and second-order effects.”
But instead, it feels (to me) like you kind of backtracked and said “no actually, it’s not really about reputation. It’s more about limited capacity– we have finite energy, attention, stress, etc. Also shared responsibility.”
It’s plausible that I’m misunderstanding something, but it felt (at least to me) like your earlier message made it seem like PR/reputation was the central factor and your later messages made it seem like it’s more about limited capacity/energy. These feel like two pretty different rationales, so it might be helpful for you to clarify which one is more influential (or present a clearer synthesis of the two rationales).
(Also, I don’t think you necessarily owe the EAF an explanation– it’s your money etc etc.)
The field is not ready, and it’s not going to suddenly become ready tomorrow. We need urgent and decisive action, but to indefinitely globally halt progress toward this technology that threatens our lives and our children’s lives, not to accelerate ourselves straight off a cliff.
I think most advocacy around international coordination (that I’ve seen, at least) has this sort of vibe to it. The claim is “unless we can make this work, everyone will die.”
I think this is an important point to be raising– and in particular I think that efforts to raise awareness about misalignment + loss of control failure modes would be very useful. Many policymakers have only or primarily heard about misuse risks and CBRN threats, and the “policymaker prior” is usually to think “if there is a dangerous tech, the most important thing to do is to make sure the US gets it first.”
But in addition to this, I’d like to see more “international coordination advocates” come up with concrete proposals for what international coordination would actually look like. If the USG “wakes up”, I think we will very quickly see that a lot of policymakers + natsec folks will be willing to entertain ambitious proposals.
By default, I expect a lot of people will agree that international coordination in principle would be safer but they will fear that in practice it is not going to work. As a rough analogy, I don’t think most serious natsec people were like “yes, of course the thing we should do is enter into an arms race with the Soviet Union. This is the safest thing for humanity.”
Rather, I think it was much more a vibe of “it would be ideal if we could all avoid an arms race, but there’s no way we can trust the Soviets to follow through on this.” (In addition to stuff that’s more vibesy and less rational than this, but I do think insofar as logic and explicit reasoning were influential, this was likely one of the core cruxes.)
In my opinion, one of the most important products for “international coordination advocates” to produce is some sort of concrete plan for The International Project. And importantly, it would need to somehow find institutional designs and governance mechanisms that would appeal to both the US and China. Answering questions like “how do the international institutions work”, “who runs them”, “how are they financed”, and “what happens if the US and China disagree” will be essential here.
The Baruch Plan and the Acheson-Lilienthal Report (see full report here) might be useful sources of inspiration.
P.S. I might personally spend some time on this and find others who might be interested. Feel free to reach out if you’re interested and feel like you have the skillset for this kind of thing.
Potentially Pavel Izmailov– not sure if he is connected to the EA community, and not sure of the exact details of why he was fired.
https://www.maginative.com/article/openai-fires-two-researchers-for-alleged-leaking/
Thanks! Familiar with the post— another way of framing my question is “has Holden changed his mind about anything in the last several months? Now that we’ve had more time to see how governments and labs are responding, what are his updated views/priorities?”
(The post, while helpful, is 6 months old, and I feel like the last several months has given us a lot more info about the world than we had back when RSPs were initially being formed/released.)
Congratulations on the new role– I agree that engaging with people outside of existing AI risk networks has a lot of potential for impact.
Besides RSPs, can you give any additional examples of approaches that you’re excited about from the perspective of building a bigger tent & appealing beyond AI risk communities? This balancing act of “find ideas that resonate with broader audiences” and “find ideas that actually reduce risk and don’t merely serve as applause lights or safety washing” seems quite important. I’d be interested in hearing if you have any concrete ideas that you think strike a good balance of this, as well as any high-level advice for how to navigate this.
Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments (you can’t do X or you can’t do X unless Y), preparedness from governments (you can keep doing X but if we see Y then we’re going to do Z), or other governance mechanisms?
(I’ll note I ask these partially as someone who has been pretty disappointed in the ultimate output from RSPs, though there’s no need to rehash that debate here– I am quite curious for how you’re reasoning through these questions despite some likely differences in how we think about the success of previous efforts like RSPs.)
What are some of your favorite examples of their effectiveness?
Congrats to Zach! I feel like this is mostly supposed to be a “quick update/celebratory post”, but I feel like there’s a missing mood that I want to convey in this comment. Note that my thoughts mostly come from an AI Safety perspective, so these thoughts may be less relevant for folks who focus on other cause areas.
My impression is that EA is currently facing an unprecedented amount of PR backlash, as well as some solid internal criticisms from core EAs who are now distancing themselves from EA. I suspect this will likely continue into 2024. Some examples:
EA has acquired several external enemies as a result of the OpenAI coup. I suspect that investors/accelerationists will be looking for ways to (further) damage EA’s reputation.
EA is acquiring external enemies as a result of its political engagements. There have been a few news articles recently criticizing EA-affiliated or EA-influenced fellowship programs and think-tanks.
EA is acquiring an increasing number of internal critics. Informally, I feel like many people I know (myself included) have become increasingly dissatisfied with the “modern EA movement” and “mainstream EA institutions”. Examples of common criticisms include “low integrity/low openness”, “low willingness to critique powerful EA institutions”, “low willingness to take actions in the world that advocate directly/openly for beliefs”, “cozyness with AI labs”, “general slowness/inaction bias”, and “lack of willingness to support groups pushing for concrete policies to curb the AI race.” (I’ll acknowledge that some of these are more controversial than others and could reflect genuine worldview differences, though even so, my impression is that they’re meaningfully contributing to a schism in ways that go beyond typical worldview differences).
I’d be curious to know how CEA is reacting to this. The answer might be “well, we don’t really focus much on AI safety, so we don’t really see this as our thing to respond to.” The answer might be “we think these criticisms are unfair/low-quality, so we’re going to ignore them.” Or the answer might be “we take X criticism super seriously and are planning to do Y about it.”
Regardless, I suspect that this is an especially important and challenging time to be the CEO of CEA. I hope Zach (and others at CEA) are able to navigate the increasing public scrutiny & internal scrutiny of EA that I suspect will continue into 2024.
This quick take seems relevant: https://forum.effectivealtruism.org/posts/auAYMTcwLQxh2jB6Z/zach-stein-perlman-s-quick-takes?commentId=HiZ8GDQBNogbHo8X8