GCR capacity-building grantmaking and projects at Open Phil.
Eli Rose
thank machine doggo
Oh whoops, didn't look at the parent comment, haha.
I think the vast majority of people making decisions about public policy or who to vote for either aren't ethically impartial, or they're "spotlighting," as you put it. I expect the kind of bracketing I'd endorse upon reflection to look pretty different from such decision-making.
But suppose I want to know which of two candidates to vote for, and I'd like to incorporate impartial ethics into that decision. What do I do then?
That said, maybe you're thinking of this point I mentioned to you on a call
Hmm, I don't recall this; another Eli perhaps? : )
(vibesy post)
People often want to be part of something bigger than themselves. At least for a lot of people this is pre-theoretic. Personally, I've felt this since I was little: to spend my whole life satisfying the particular desires of the particular person I happened to be born into the body of seemed pointless and uninteresting.
I knew I wanted "something bigger" even when I was young (e.g. 13 years old). Around this age my dream was to be a novelist. This isn't a kind of desire people would generally call "altruistic," nor would my younger self have called it "altruistic." But it was certainly grounded in a desire for my life to mean something to other people. Stuff like the Discworld series and Watchmen really meant something to me, and I wanted to write stuff that meant something to others in the same way.
My current dreams and worldview, after ~10 years of escalating involvement with EA, seem to me to spring from the same seed. I feel continuous with my much younger self. I want my life to mean something to others: that is the obvious yardstick. I want to be doing the most I can on that front.
The empirics were the surprising part. It turns out that the "basic shape of the world" is much more mutable than my younger self thought, and in light of this my earlier dreams seem extremely unambitious. Astonishingly, I can probably:
save many lives over my career, at minimum by donating to GiveWell, and likely more by doing more off-the-beaten-path things
save <large number> of e.g. chickens from lives full of torture
be part of a pretty small set of people seriously trying to do something about truly wild risks from new AI and bioengineering technologies
It probably matters more to others that they are not tortured, or dying of malaria, or suffering some kind of AI catastrophe, than that there is another good book for them to read, especially given there are already a lot of good novelists. The seed of the impulse is the same: wanting to be part of something bigger, wanting to live for my effect on others and not just myself. My sense of what is truly out there in the world, and of what I can do about it, is what's changed.
Like if you're contemplating running a fellowship program for AI-interested people, and you have animals in your moral circle, you're going to have to build this botec that includes the probability that X% of the people you bring into the fellowship are not going to care about animals and are likely, if they get a policy role, to pass policies that are really bad for them...
...I sort of suspect that only a handful of people are trying to do this, and I get why! I made a reasonably straightforward botec for calculating the benefits to birds of bird-safe glass, that accounted for backfire to birds, and it took a lot of research effort. If you asked me how bird-safe glass policy is going to affect AI risk after all that, I might throw my computer at you. But I think the precise probabilities approach would imply that I should.
Just purely on the descriptive level and not the normative one:
I agree, but even more strongly: in AI safety I've basically never seen a BOTEC this detailed. I think Eric Neyman's BOTEC of the cost-effectiveness of donating to congressional candidate Alex Bores is a good public example of the type of analysis common in EA-driven AI safety work: it bottoms out in pretty general goods like "government action on AI safety" and does not try to model second-order effects to the degree described here. It doesn't even model considerations like "what if AI safety legislation is passed, but that legislation backfires by increasing polarization on the issue?", let alone anything about animals.
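For concreteness, here's a minimal sketch (in Python, with made-up placeholder numbers, not Eric's actual figures) of the general shape such a BOTEC takes: a few first-order factors multiplied together, bottoming out in a generic good, with no terms for polarization, animals, or other second-order effects.

```python
# Toy sketch of the *shape* of a typical EA/AI-safety cost-effectiveness BOTEC.
# Every number below is a made-up placeholder, not anyone's actual estimate.

donation = 100_000                    # dollars donated to the campaign
p_donation_flips_race = 0.002         # chance the marginal dollars change the outcome
p_candidate_moves_policy = 0.3        # chance the candidate meaningfully advances AI safety legislation
value_of_policy_win = 1_000_000_000   # dollars-equivalent value assigned to the generic good
                                      # "government action on AI safety" (no second-order modeling)

expected_value = p_donation_flips_race * p_candidate_moves_policy * value_of_policy_win

print(f"Expected value:     ${expected_value:,.0f}")
print(f"Benefit-cost ratio: {expected_value / donation:.1f}x")
```

Note what's absent: no term for backfire via polarization, nothing about animals, and no attempt to model how "government action" cashes out for anyone downstream.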
Instead, this kind of strategic discussion tends to be qualitative, and is hashed out in huge blocks of prose and comment threads e.g. on LessWrong, or verbally.
I sort of wonder if some people in the AI community (and maybe you, from what you've said here?) are using precise probabilities to get to the conclusion that you want to work primarily on AI stuff, and then spotlighting to that cause area when you're analyzing at the level of interventions.
I see why you describe it this way, and directionally this seems right. But what we do doesn't really sound like "spotlighting" as you describe it in the post: focusing on specific moral patient groups and explicitly setting aside others.
Essentially I think the epistemic framework we use is just more anarchic and freeform than that! In AIS discourse, it feels like "but this intervention could slow down the US relative to China" or "but this intervention could backfire by increasing polarization" or "but this intervention could be bad for animals" exist at the same epistemic level, and all are considered valid points to raise.
(I do think that there is a significant body of orthodox AI safety thought which takes particular stances on each of these and other issues, and which in a lot of contexts likely makes various points feel not "valid" to raise. I think this is unfortunate.)
Maybe it's similar to the difference between philosophy and experimental science: in philosophy a lot of discourse is fundamentally unstructured and qualitative, while in the experimental sciences there is much more structure, because any contribution needs to be an empirical experiment and there are specific norms and formats for those, which have certain implications for how second-order effects are or aren't considered. AI safety discourse also feels similar at times to wonk-ish policy discourse.
(Within certain well-scoped sub-areas of AI safety things are less epistemically anarchic; e.g. research into AI interpretability usually needs empirical results if it's to be taken seriously.)
I think someone using precise probabilities all the way down is building a lot more explicit models every time they consider a specific intervention. Like if you're contemplating running a fellowship program for AI-interested people, and you have animals in your moral circle, you're going to have to build this botec that includes the probability that X% of the people you bring into the fellowship are not going to care about animals and are likely, if they get a policy role, to pass policies that are really bad for them. And all sorts of things like that. So your output would be a bunch of hypotheses about exactly how these fellows are going to benefit AI policy, and some precise probabilities about how those policy benefits are going to help people, and possibly animals, to what degree, etc.
Hmm, I wouldn't agree that someone using precise probabilities "all the way down" is necessarily building this kind of explicit model. I wonder if the term "precise probabilities" is being understood differently in our two areas.
In the Bayesian epistemic style that EA x AI safety has, it's felt that anyone can attach precise probabilities to their beliefs with ~no additional thought, and that these probabilities are subjective things which may not be backed by any kind of explicit or even externally legible model. There's a huge focus on probabilities as betting odds, and betting odds don't require such things (diverging notably from how probabilities are used in science).
I mean, I think typically people have something to say to justify their beliefs, but this can be & often is something as high-level as "it seems good if AGI companies are required to be more transparent about their safety practices," with little in the way of explicit models about downstream effects thereof.[1]
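To illustrate the "probabilities as betting odds" point from a couple of paragraphs up, here's a toy sketch (arbitrary numbers of my own) of how a bare subjective credence cashes out as betting behavior and expected-value reasoning, with no explicit world-model anywhere in sight:

```python
# Toy illustration: a subjective credence translates directly into betting odds
# and expected values, without requiring any explicit or externally legible model.
# The credence and bet terms are arbitrary numbers chosen for illustration.

credence = 0.7  # subjective probability that some proposition P is true

# Fair odds implied by that credence (stake on P : stake on not-P).
implied_odds = credence / (1 - credence)
print(f"Implied fair odds: {implied_odds:.2f} : 1")

# Decision rule for a bet that costs `price` and pays `payout` if P is true:
# take it exactly when credence * payout exceeds the price.
price, payout = 60, 100
expected_return = credence * payout - price
print(f"Expected return: {expected_return:+.0f}  ->  {'take' if expected_return > 0 else 'pass'}")
```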
Apologies for not responding to some of the other threads in your post; I ran out of time. Looking forward to discussing in person sometime.
- ^
While it's common for AI safety people to agree with my statement about transparency here, some may flatly disagree (i.e. disagree about sign), and others (more commonly) may disagree massively about the magnitude of the effect. There are many verbal arguments but relatively few explicit models to adjudicate these disputes.
- ^
I just remembered Matthew Barnett's 2022 post My Current Thoughts on the risks from SETI, which is a serious investigation into how to mitigate this exact scenario.
That does seem right, thanks. I intended to include dictator-ish human takeover there (which seems to me to be at least as likely as misaligned AI takeover) as well, but didn't say that clearly.
Edited to "relatively amoral forces," which still isn't great but maybe a little clearer.
Enjoyed this post.
Maybe I'll speak from an AI safety perspective. The usual argument among EAs working on AI safety is:
1. the future is large and plausibly contains much goodness
2. today, we can plausibly do things to steer (in expectation) towards achieving this goodness and away from catastrophically losing it
3. the invention of powerful AI is a super important leverage point for such steering
This is also the main argument motivating me, though I retain meaningful meta-uncertainty and am also interested in more commonsense motivations for AI safety work.
A lot of the potential goodness in 1. seems to come from digital minds that humans create, since it seems that at some point these will be much quicker to replicate than humans or animals. But lots of the interventions in 2. also seem helpful for getting things to go better for current farmed and wild animals, e.g. because they are aimed at avoiding a takeover of society by forces which don't care at all about morals. Personally I hope we use technology to lift wild animals out of their current predicament, although I have little idea what that would look like with any concreteness.
This relies on what you call the "assigning precise probabilities" approach, and indeed I rarely encounter AI safety x EA people who aren't happy assigning precise probabilities, even in the face of deep uncertainty. I really like how your post points out that this is a difference from the discourse around wild animal welfare but that it's not clear what the high-level reason for this is. I don't see a clear high-level reason either from my vantage point. Some thoughts:
It might be interesting to move out of high-level reason zone entirely and just look at the interventions, e.g. directly compare the robustness of installing bird-safe glass in a building vs. something like developing new technical techniques to help us avoid losing control of AIs.
What would the justification standards in wild animal welfare say about uncertainty-laden decisions that involve neither AI nor animals: e.g. as a government, deciding which policies to enact, or as a US citizen, deciding who to vote for as President?
Coda: to your "why should justification standards be the same" question, I'd just want to say I'm very interested in maintaining the ideal that EAs compare and debate these things; thanks for writing this!
No one is dying of not reading Proust, but many people are leading hollower and shallower lives because the arts are so inaccessible.
Tangential to your main point, and preaching to the choir, but... why are "the arts" "inaccessible"? The Internet is a huge revolution in the democratization of art relative to most of human history, TV dramas are now much more complex and interesting than they have been in the past, A24 is pumping out tons of weird/interesting movies, and way more people are making interesting music and distributing it than before.
I think (and this is a drive-by comment; I haven't read the article) the author is conflating "serious literature" (often an acquired taste that people need to get from a class or similar) with all of "the arts." I studied literature in college, read poetry and e.g. Tolstoy in my free time now, yada yada, and I think this is extremely paternalistic.
I think there's value in someone teaching you to enjoy Proust, and indeed I wish more people had access to that sort of thing. But I don't think it comes anywhere close to deserving the kind of uniquely elevated position over other forms of artistic production that literature professors etc. sometimes (not always) want to give it, and which I feel is on display in this quote.
In any case, the obvious thing to do is ask whether the beneficiaries would prefer more soup or more Proust.
Vince Gilligan (the Breaking Bad guy) has a new show Pluribus which is many things, but also illustrates an important principle, that being (not a spoiler I think since it happens in the first 10 minutes)...
If you are SETI and you get an extraterrestrial signal which seems to code for a DNA sequence...
DO NOT SYNTHESIZE THE DNA AND THEN INFECT A BUNCH OF RATS WITH IT JUST TO FIND OUT WHAT HAPPENS.
Just don't. Not a complicated decision. All you have to do is go from "I am going to synthesize the space sequence" to "nope" and look at that, x-risk averted. You're a hero. Incredible work.
One note: I think it would be easy for this post to be read as "EA should be all about AGI" or "EA is only for people who are focused on AGI."
I don't think that is or should be true. I think EA should be for people who care deeply about doing good, and who embrace the principles as a way of getting there. The empirics should be up for discussion.
Appreciated this a lot, agree with much of it.
I think EAs and aspiring EAs should try their hardest to incorporate every available piece of evidence about the world when deciding what to do and where to focus their efforts. For better or worse, this includes evidence about AI progress.
The list of important things to do under the "taking AI seriously" umbrella is very large, and the landscape is underexplored, so there will likely be more things for the list in due time. So EAs who are already working "in AI safety" shouldn't feel like their cause prioritization is over and done. AI safety is not the end of cause prio.
People interested in funding for field-building projects for the topics on Will's menu above can apply to my team at Open Philanthropy here, or contact us here. We don't necessarily fund all these areas, but we're open to more of them than we receive applications for, so it's worth asking.
Thanks for writing this Arden! I strong upvoted.
I do my work at Open Phil (funding both AIS and EA capacity-building) because I'm motivated by EA. I started working on this in 2020, a time when there were way fewer concrete proposals for what to do about averting catastrophic AI risks & way fewer active workstreams. It felt like EA was necessary just to get people thinking about these issues. Now the catastrophic AI risk field is much larger and somewhat more developed, as you point out. And so much the better for the world!
But it seems so far from the case that EA-style thinking is "done" with regard to TAI. This would mean we've uncovered every new consideration & workstream that could/should be worked on in the years before we are obsoleted by AIs. This sounds so unlikely given how huge and confusing the TAI transition would be.
EAs are distinctive in their moral focus plus their flexibility in what they work on. I like your phrasing here of "constantly up for re-negotiation," which imo names a distinctively EA trait. To add to your list, I think EA-style thought is also distinctive in its ambition and its focus on the truth (even in very confusing/contentious domains). I think EAs in the AI safety field are still person-for-person outperforming, e.g. in founding helpful new AI safety research agendas. And I think our success is in large part due to the characteristics I mention above. This seems like a pretty robust dynamic, so I expect it to continue at least in the medium term.
(And overall, my guess is distinctively EA characteristics will become more important as the project of TAI preparation becomes more multifaceted and confusing.)
Am I right that a bunch of the content of this response itself was written by an AI?
I enjoyed this, in particular:
the inner critic is actually a kind of ego trip
which resonates for me.
I personally experience my inner critic as something which often prevents me from "seeing the world clearly," including seeing good things I've done clearly, and seeing my action space clearly. It's odd that this is true, because you'd think the point of criticism is to check optimism and help us see things more clearly. And I find this to be very true of other people's criticism, and for some mental modes of critiquing my own plans.
But the distinct flavor of self-critique I'd call "my inner critic" is a weird little guy who seems mostly to be interested in bucketing me as "a good person" or "a bad person." This isn't helpful, and it fluctuates week to week, which is also not helpful. It also tends to overweight social evidence relative to other types of evidence. It seems self-obsessed. I would not hire it.
And placing some weight on the prediction that the curve will simply continue[1] seems like a useful heuristic / counterbalance (and has performed well).
"and has performed well" seems like a good crux to zoom in on; for which reference class of empirical trends is this true, and how true is it?
It's hard to disagree with "place some weight"; imo it always makes sense to have some prior that past trends will continue. The question is how much weight to place on this heuristic vs. more gears-level reasoning.
For a random example, observers in 2009 might have mispredicted Spanish GDP over the next ten years if they placed a lot of weight on this prior.
I'm skeptical of an "exponentials generally continue" prior which is supposed to apply super-generally. For example, here's a graph of world population since 1000 AD; it's an exponential, but actually there are good mechanistic reasons to think it won't continue along this trajectory. Do you think it's very likely to?
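To make the "how much weight" question from a few paragraphs up concrete, here's a toy sketch (my own framing, with invented numbers) of literally blending a "the curve simply continues" forecast with a gears-level one:

```python
# Toy sketch: treating "some weight on trend continuation" as a literal weighted
# blend of a naive extrapolation and a gears-level (mechanistic) forecast.
# Both forecasts and the weights are invented placeholders, purely illustrative.

def blended_forecast(trend_extrapolation: float,
                     gears_level_estimate: float,
                     weight_on_trend: float) -> float:
    """Weighted average of 'the curve simply continues' and a mechanistic model."""
    return (weight_on_trend * trend_extrapolation
            + (1 - weight_on_trend) * gears_level_estimate)

# e.g. forecasting next year's value of some capability metric, relative to today:
trend_says = 2.0      # naive extrapolation: the exponential continues, metric doubles
mechanism_says = 1.3  # gears-level model: bottlenecks slow growth to +30%

for w in (0.2, 0.5, 0.8):
    print(f"weight on trend = {w:.1f} -> forecast multiplier = "
          f"{blended_forecast(trend_says, mechanism_says, w):.2f}")
```

The disagreement then localizes in the choice of weight, which is exactly where the reference-class question above bites.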
Awesome job, Rob and team.
It turns out, these managed hives, theyâre just un-bee-leave-able.
I'm quite excited about EAs making videos about EA principles and their applications, and I think this is an impactful thing for people to explore. It seems quite possible to do in a way that doesn't compromise on idea fidelity; I think sincerity counts for quite a lot. In many cases I think videos and other content can be lighthearted / fun / unserious and still transmit the ideas well.