I expected readers to assume that my wife owned significant equity in Anthropic; I’ve now edited the post to state this explicitly (and also added a mention of her OpenAI equity, which I should’ve included before and have included in the past). I don’t plan to disclose the exact amount and don’t think this is needed for readers to have sufficient context on my statements here.
Holden Karnofsky
Sorry, I didn’t mean to dismiss the importance of the conflict of interest or say it isn’t affecting my views.
I’ve sometimes seen people reason along the lines of “Since Holden is married to Daniela, this must mean he agrees with Anthropic on specific issue X,” or “Since Holden is married to Daniela, this must mean that he endorses taking a job at Anthropic in specific case Y.” I think this kind of reasoning is unreliable and has been incorrect in more than one specific case. That’s what I intended to push back against.
Thanks! I’m looking for case studies that will be public; I’m agnostic about where they’re posted beyond that. We might consider requests to fund confidential case studies, but this project is meant to inform broader efforts, so confidential case studies would still need to be cleared for sharing with a reasonable set of people, and the funding bar would be higher.
I think this was a goof due to there being a separate hardcover version, which has now been removed—try again?
To give a rough idea, I basically mean anyone who is likely to harm those around them (using a common-sense idea of doing harm) and/or “pollute the commons” by having an outsized and non-consultative negative impact on community dynamics. It’s debatable what the best warning signs are and how reliable they are.
Re: “In the weeks leading up to that April 2018 confrontation with Bankman-Fried and in the months that followed, Mac Aulay and others warned MacAskill, Beckstead and Karnofsky about her co-founder’s alleged duplicity and unscrupulous business ethics” -
I don’t remember Tara reaching out about this, and I just searched my email for signs of this and didn’t see any. I’m not confident this didn’t happen, just noting that I can’t remember or easily find signs of it.
In terms of what I knew/learned in 2018 more generally, I discuss that here.
For context, my wife is the President and co-founder of Anthropic, and formerly worked at OpenAI.
80% of her equity in Anthropic is (not legally bindingly) pledged for donation. None of her equity in OpenAI is. She may pledge more in the future if there is a tangible compelling reason to do so.
I plan to be highly transparent about my conflict of interest, e.g. I regularly open meetings by disclosing it if I’m not sure the other person already knows about it, and I’ve often mentioned it when discussing related topics on Cold Takes.
I also plan to discuss the implications of my conflict of interest for any formal role I might take. It’s possible that my role in helping with safety standards will be limited to advising with no formal powers (it’s even possible that I’ll decide I simply can’t work in this area due to the conflict of interest, and will pursue one of the other interventions I’ve thought about).
But right now I’m just exploring options and giving non-authoritative advice, and that seems appropriate. (I’ll also note that I expect a lot of advice and opinions on standards to come from people who are directly employed by AI companies; while this does present a conflict of interest, and a more direct one than mine, I think it doesn’t and can’t mean they are excluded from relevant conversations.)
There was no one with official responsibility for the relationship between FTX and the EA community. I think the main reason the two were associated was FTX/Sam having a high profile and talking a lot about EA; that’s not something anyone else was able to control. (Some folks did ask him to do less of this.)
It’s also worth noting that we generally try to be cautious about power dynamics as a funder, which means we are hesitant to be pushy about most matters. In particular, I think one of two major funders in this space attacking the other, nudging grantees to avoid association and funding from it, etc. would’ve been seen as strangely territorial behavior absent very strong evidence of misconduct.
That said: as mentioned in another comment, with the benefit of hindsight, I wish I’d reasoned more like this: “This person is becoming very associated with effective altruism, so whether or not that’s due to anything I’ve done, it’s important to figure out whether that’s a bad thing and whether proactive distancing is needed.”
In 2018, I heard accusations that Sam had communicated in ways that left people confused or misled, though often with some ambiguity about whether Sam had been confused himself, had been inadvertently misleading while factually accurate, etc. I put some effort into understanding these concerns (but didn’t spend a ton of time on it; Open Phil didn’t have a relationship with Sam or Alameda).
I didn’t hear anything that sounded anywhere near as bad as what has since come out about his behavior at FTX. At the time I didn’t feel my concerns rose to the level where it would be appropriate or fair to publicly attack or condemn him. The whole situation did make me vaguely nervous, and I spoke with some people about it privately, but I never came to a conclusion that there was a clearly warranted (public) action.
I don’t believe #1 is correct. The Open Philanthropy grant is a small fraction of the funding OpenAI has received, and I don’t think it was crucial for OpenAI at any point.
I think #2 is fair insofar as running a scaling lab poses big risks to the world. I hope that OpenAI will avoid training or deploying directly dangerous systems; I think that even the deployments it’s done so far pose risks via hype and acceleration. (Considering the latter a risk to society is an unusual standard to hold a company to, but I think it’s appropriate here.)
#3 seems off to me: “regulatory capture” does not describe what’s at the link you gave (where’s the regulator?). At best it seems like a strained analogy, and even there it doesn’t seem right to me; I don’t know of any sense in which I or anyone else was “captured” by OpenAI.
I can’t comment on #4.
#5 seems off to me. I don’t know whether OpenAI uses nondisparagement agreements; I haven’t signed one. The reason I am careful with public statements about OpenAI is (a) it seems generally unproductive for me to talk carelessly in public about important organizations (likely to cause drama and drain the time and energy of me and others); (b) I am bound by confidentiality requirements, which are not the same as nondisparagement requirements. Information I have access to via having been on the board, or via being married to a former employee, is not mine to freely share.
Just noting that many of the “this concept is properly explained elsewhere” links are also accompanied by expandable boxes that you can click to expand for the gist. I do think that understanding where I’m coming from in this piece requires a bunch of background, but I’ve tried to make it as easy on readers as I could, e.g. explaining each concept in brief and providing a link if the brief explanation isn’t clear enough or doesn’t address particular objections.
Noting that I’m now going back through posts responding to comments, after putting off doing so for months—I generally find it easier to do this in bulk to avoid being distracted from my core priorities, though this time I think I put it off longer than I should’ve.
It is generally true that my participation in comments is extremely sporadic/sparse, and folks should factor that into curation decisions.
I wouldn’t say I’m in “sprinting” mode—I don’t expect my work hours to go up (and I generally work less than I did a few years ago, basically because I’m a dad now).
The move is partly about AI timelines, partly about the opportunities I see and partly about Open Philanthropy’s stage of development.
I threw that book together for people who want to read it on Kindle, but it’s quite half-baked. If I had the time, I’d want to rework the series (and a more recent followup series at https://www.cold-takes.com/tag/implicationsofmostimportantcentury/) into a proper book, but I’m not sure when or whether I’ll do this.
I expect more funding discontinuations than usual, but we generally try to discontinue funding in a way that gives organizations time to plan around the change.
I’m not leading the longer-term process. I expect Open Philanthropy will publish content about it, but I’m not sure when.
I don’t have a good answer, sorry. The difficulty of getting cardinal estimates for longtermist grants is a lot of what drove our decision to go with an ordinal approach instead.
Aiming to spend down in less than 20 years would not obviously be justified even if one’s median for transformative AI timelines were well under 20 years. This is because we may want extra capital in a “crunch time” where we’re close enough to transformative AI for the strategic picture to have become a lot clearer, and because even a 10-25% chance of longer timelines would provide some justification for not spending down on short time frames.
This move could be justified if the existing giving opportunities were strong enough even with a lower bar. That may end up being the case in the future. But we don’t feel it’s the case today, having eyeballed the stack rank.
Here’s a followup with some reflections.
Note that I discuss some takeaways and potential lessons learned in this interview.
Here are some (somewhat redundant with the interview) things I feel like I’ve updated on in light of the FTX collapse and aftermath:
The most obvious thing that’s changed is a tighter funding situation, which I addressed here.
I’m generally more concerned about the dynamics I wrote about in EA is about maximization, and maximization is perilous. If I wrote that piece today, most of it would be the same, but the “Avoiding the pitfalls” section would be quite different (less reassuring/reassured). I’m not really sure what to do about these dynamics, i.e., how to reduce the risk that EA will encourage and attract perilous maximization, but a couple of possibilities:
It looks to me like the community needs to beef up and improve investments in activities like “identifying and warning about bad actors in the community,” and I regret not taking a stronger hand in doing so to date. (Recent sexual harassment developments reinforce this point.)
I’ve long wanted to try to write up a detailed intellectual case against what one might call “hard-core utilitarianism.” I think arguing about this sort of thing on the merits is probably the most promising way to reduce associated risks; EA isn’t (and I don’t want it to be) the kind of community where you can change what people operationally value just by saying you want it to change, and I think the intellectual case has to be made. I think there is a good substantive case for pluralism and moderation that could be better-explained and easier to find, and I’m thinking about how to make that happen (though I can’t promise to do so soon).
I had some concerns about SBF and FTX, but I largely thought of the situation as not being my responsibility, as Open Philanthropy had no formal relationship to either. In hindsight, I wish I’d reasoned more like this: “This person is becoming very associated with effective altruism, so whether or not that’s due to anything I’ve done, it’s important to figure out whether that’s a bad thing and whether proactive distancing is needed.”
I’m not surprised there are some bad actors in the EA community (I think bad actors exist in any community), but I’ve raised my estimate of how much harm a small set of them can do, and hence I think it could be good for Open Philanthropy to become more conservative about funding and associating with people who might end up being bad actors (while recognizing that it won’t be able to predict perfectly on this front).
Prior to the FTX collapse, I had been gradually updating toward feeling like Open Philanthropy should be less cautious with funding and other actions; quicker to trust our own intuitions and people who intuitively seemed to share our values; and generally less cautious. Some of this update was based on thinking that some folks associated with FTX were being successful with more self-trusting, less-cautious attitudes; some of it was based on seeing few immediate negative consequences of things like the Future Fund regranting program; some of it was probably a less rational response to peer pressure. I now feel the case for caution and deliberation in most actions is quite strong—partly because the substantive situation has changed (effective altruism is now enough in the spotlight, and controversial enough, that the costs of further problems seem higher than they did before).
On this front, I’ve updated a bit toward my previous self, and more so toward Alexander’s style, in terms of wanting to weigh both explicit risks and vague misgivings significantly before taking notable actions. That said, I think balance is needed and this is only a fairly moderate update, partly because I didn’t update enormously in the other direction before. I think I’m still overall more in favor of moving quickly than I was ~5 years ago, for a number of reasons. In any case I don’t expect there to be a dramatic visible change on this front in terms of Open Philanthropy’s grantmaking, though it might be investing more effort in improving functions like community health.
Having seen the EA brand under the spotlight, I now think it isn’t a great brand for wide public outreach. It throws together a lot of very different things (global health giving, global catastrophic risk reduction, longtermism) in a way that makes sense to me but seems highly confusing to many, and puts them all under a wrapper that seems self-righteous and, for lack of a better term, punchable? I still think of myself as an effective altruist and think we should continue to have an EA brand for attracting the sort of people (like myself) who want to put a lot of dedicated, intensive time into thinking about what issues they can work on to do the most good; but I’m not sure this is the brand that will or should attract most of the people who can be helpful on key causes. I think it’s probably good to focus more on building communities and professional networks around specific causes (e.g., AI risk, biorisk, animal welfare, global health) relative to building them around “EA.”
I think we should see “EA community building” as less valuable than before, if only because one of the biggest seeming success stories now seems to be a harm story. I think this concern applies to community building for specific issues as well. It’s hard to make a clean quantitative statement about how this will change Open Philanthropy’s actions, but it’s a factor in how we recently ranked grants. I think it’ll be important to do quite a bit more thinking about this (and in particular, to gather more data along these lines) in the longer run.
My point with the observation you quoted wasn’t “This would be unprecedented, therefore there’s a very low prior probability.” It was more like: “It’s very hard to justify >90% confidence on anything without some strong base rate to go off of. In this case, we have no base rate to go off of; we’re pretty wildly guessing.” I agree something weird has to happen fairly “soon” by zoomed-out historical standards, but there are many possible candidates for what the weird thing is (I also endorse dsj’s comment below).
Digital content requires physical space too, just relatively small amounts. E.g., physical resources/atoms are needed to make the calculations associated with digital interactions. At some point the number of digital interactions will be capped, and the question will be how much they can be made better and better. More on the latter here: https://www.cold-takes.com/more-on-multiple-world-size-economies-per-atom/