Econ PhD student at Oxford and research associate at the Global Priorities Institute. I’m slightly less ignorant about economic theory than about everything else.
trammell
Ok—at Toby’s encouragement, here are my thoughts:
This is a very old point, but to my mind, at least from a utilitarian perspective, the main reason it’s worth working on promoting AI welfare is the risk of foregone upside. I.e. without actively studying what constitutes AI welfare and advocating for producing it, we seem likely to have a future that’s very comfortable for ourselves and our descendants—fully automated luxury space communism, if you like—but which contains a very small proportion of the value that could have been created by creating lots of happy artificial minds. So concern for creating AI welfare seems likely to be the most important way in which utilitarian and human-common-sense moral recommendations differ.
It seems to me that the amount of value we could create if we really optimized for total AI welfare is probably greater than the amount of disvalue we’ll create if we just use AI tools and allow for suffering machines by accident, since in the latter case the suffering would be a byproduct, not something anyone optimizes for.
But AI welfare work (especially if this includes moral advocacy) just for the sake of avoiding this downside also seems valuable enough to be worth a lot of effort on its own, even if suffering AI tools are a long way off. The animal analogy seems relevant: it’s hard to replace factory farming once people have started eating a lot of meat, but in India, where Hinduism has discouraged meat consumption for a long time, less meat is consumed and so factory farming is evidently less widespread.
So in combination, I expect AI welfare work of some kind or another is probably very important. I have almost no idea what the best interventions would be or how cost-effective they would be, so I have no opinion on exactly how much work should go into them. I expect no one really knows at this point. But at face value the topic seems important enough to warrant at least doing exploratory work until we have a better sense of what can be done and how cost-effective it could be, only stopping in the (I think unlikely) event that we can say with some confidence that the best AI welfare work to be done is worse than the best work that can be done in other areas.
- 5 Jul 2024 15:38 UTC; 15 points) 's comment on Discussion Thread: AI Welfare Debate Week by (
The point that it’s better to save people with better lives than people with worse lives, all else equal, does make sense (at least from a utilitarian perspective). So you’re right that [$ / lives saved] is not a perfect approach. I do think it’s worth acknowledging this...!
But the right correction isn’t to use VSLs. The way I’d put it is: a person’s VSL—assuming it’s been ideally calculated for each individual, putting aside issues about how governments estimate it in practice—is how many dollars they value as much as slightly lowering their chance of death. So the fact that VSLs differ across people mixes together two things: a rich person might have a higher VSL than a poor person (1) because the rich person values their life more, or (2) because the rich person values a dollar less. The first thing is right to correct for (from a utilitarian perspective), but as other commenters have noted, the second isn’t.
My guess is that the second factor baked into the VSL is bigger in most real-world comparisons we might want to make, so that it’s less of a mistake to just try to maximize [$ / lives saved] than to try to maximize [$ / (lives saved * VSL)].
Ah I see, sorry. Agreed
I don’t follow—are you saying that (i) AI safety efforts so far have obviously not actually accomplished much risk-reduction, (ii) that this is largely for risk compensation reasons, and (iii) that this is worth emphasizing in order to prevent us from carrying on the same mistakes?
If so, I agree that if (i)-(ii) are true then (iii) seems right, but I’m not sure about (i) and (ii). But if you’re just saying that it would be good to know whether (i)-(ii) are true because if they are then it would be good to do (iii), I agree.
Whoops, thanks! Issues importing from the Google doc… fixing now.
Good to hear, thanks!
I‘ve just edited the intro to say: it’s not obvious to me one way or the other whether it’s a big deal in the AI risk case. I don’t think I know much about the AI risk case (or any other case) to have much of an opinion, and I certainly don’t think anything here is specific enough to come to a conclusion in any case. My hope is just that something here makes it easier to for people who do know about particular cases to get started thinking through the problem.
If I have to make a guess about the AI risk case, I’d emphasize my conjecture near the end, just before the “takeaways” section, namely that (as you suggest) there currently isn’t a ton of restraint, so (b) mostly fails, but that this has a good chance of changing in the future:
Today, while even the most advanced AI systems are neither very capable nor very dangerous, safety concerns are not constraining much below . If technological advances unlock the ability to develop systems which offer utopia if their deployment is successful, but which pose large risks, then the developer’s choice of at any given is more likely to be far below
, and the risk compensation induced by increasing is therefore more likely to be strong.If lots/most of AI safety work (beyond evals) is currently acting more “like evals” than like pure “increases to S”, great to hear—concern about risk compensation can just be an argument for making sure it stays that way!
Thanks for noting this. If in some case there is a positive level of capabilities for which P is 1, then we can just say that the level of capabilities denoted by C = 0 is the maximum level at which P is still 1. What will sort of change is that the constraint will be not C ≥ 0 but C ≥ (something negative), but that doesn’t really matter since here you’ll never want to set C<0 anyway. I’ve added a note to clarify this.
Maybe a thought here is that, since there is some stretch of capabilities along which P=1, we should think that P(.) is horizontal around C=0 (the point at which P can start falling from 1) for any given S, and that this might produce very different results from the example in which there would be a kink at C=0. But no—the key point is whether increases to S change the curve in a way that widens as C moves to the right, and so “act as price decreases to C”, not the slope of the curve around C=0. E.g. if (for , and 0 above), then in the k=0 case where the lab is trying to maximize , they set , and so P is again fixed (here, at 2⁄3) regardless of S.
Notes on risk compensation
Hey David, I’ve just finished a rewrite of the paper which I’m hoping to submit soon, which I hope does a decent job of both simplifying it and making clearer what the applications and limitations are: https://philiptrammell.com/static/Existential_Risk_and_Growth.pdf
Presumably the referees will constitute experts on the growth front at least (if it’s not desk rejected everywhere!), though the new version is general enough that it doesn’t really rely on any particular claims about growth theory.
Hold on, just to try wrapping up the first point—if by “flat” you meant “more concave”, why do you say “I don’t see how [uncertainty] could flatten out the utility function. This should be in “Justifying a more cautious portfolio”?”
Did you mean in the original comment to say that you don’t see how uncertainty could make the utility function more concave, and that it should therefore also be filed under “Justifying a riskier portfolio”?
I can’t speak for Michael of course, but as covered throughout the post, I think that the existing EA writing on this topic has internalized the pro-risk-tolerance points (e.g. that some other funding will be coming from uncorrelated sources) quite a bit more than the anti-risk-tolerance points (e.g. that some of the reasons that many investors seem to value safe investments so much, like “habit formation”, could apply to philanthropists to some extent as well). If you feel you and some other EAs have already internalized the latter more than the former, then that’s great too, as far as I’m concerned—hopefully we can come closer to consensus about what the valid considerations are, even if from different directions.
By flattening here, I meant “less concave”—hence more risk averse. I think we agree on this point?
Less concave = more risk tolerant, no?
I think I’m still confused about your response on the second point too. The point of this section is that since there are no good public estimates of the curvature of the philanthropic utility function for many top EA cause areas, like x-risk reduction, we don’t know if it’s more or less concave than a typical individual utility function. Appendix B just illustrates a bit more concretely how it could go either way. Does that make sense?
Thanks! As others have commented, the strength of this consideration (and of many of the other considerations) is quite ambiguous, and I’d love to see more research on it. But at least qualitatively, I think it’s been underappreciated by existing discussion.
Thanks! Hardly the first version of an article like this (or most clearly written), but hopefully a bit more thorough…!
I agree! As noted under Richard’s comment, I’m afraid my only excuse is that the points covered are scattered enough that writing a short, accessible summary at the top was a bit of a pain, and I ran out of time to write this before I could make it work. (And I won’t be free again for a while…)
If you or anyone else reading this manages to write one in the meantime, send it over and I’ll stick it at the top.
Thanks! I agree that would be helpful. My only excuse is that the points covered are scattered enough that writing a short, accessible summary at the top was a bit of a pain, and I ran out of time to write this before I could make it work…
Hi Peter, thanks again for your comments on the draft! I think it improved it a lot. And sorry for the late reply here—just got back from vacation.
I agree that the cause variety point includes what you might call “sub-cause variety” (indeed, I changed the title of that bit from “cause area variety” to “cause variety” for that reason). I also agree that it’s a really substantial consideration: one of several that can single-handedly swing the conclusion. I hope you/others find the simple model of Appendix C helpful for starting to quantify just how substantial it is. My own current feeing is that it’s more substantial than I thought when I first started thinking about this question, though not enough to unambiguously outweigh countervailing considerations, like the seemingly unusually high beta of EA-style philanthropic funding.
I also agree that the long-run correlations between asset returns and the consumption of the global poor seems like an important variable to look into more insofar as we’re thinking about the global poverty context, and that it could turn out to be weak enough that using an effective eta<1 is warranted even if we’re operating on a long time horizon.
Hi, sorry for the late reply—just got back from vacation.
As with most long posts, I expect this post has whatever popularity it has not because many people read it all, but because they skimmed parts and thought they made sense, and thought the overall message resonated with their own intuitions. Likewise, I expect your comment has whatever popularity it has because they have different intuitions, and because it looks on a skim as though you’ve shown that a careful reading of the post validates those intuitions instead…! But who knows.
Since there are hard-to-quantify considerations both for and against philanthropists being very financially risk tolerant, if your intuitions tend to put more weight on the considerations that point in the pro-risk-tolerance direction, you can certainly read the post and still conclude that a lot of risk tolerance is warranted. E.g. my intuition differs from yours at the top of this comment. As Michael Dickens notes, and as I say in the introduction, I think the post argues on balance against adopting as much financial risk tolerance as existing EA discourse tends to recommend.
Beyond an intuition-based re-weighting of the considerations, though, you raise questions about the qualitative validity of some of the points I raise. And as long as your comment is, I think the post does already address essentially all these questions. (Indeed, addressing them in advance is largely why the post is as long as it is!) For example, regarding “arguments from uncertainty”, you say
I don’t see how this could flatten out the utility function. This should be in “Justifying a more cautious portfolio”.
But to my mind, the way this flattening could work is explained in the “Arguments from uncertainty” section:
“one might argue that philanthropists have a hard time distinguishing between the value of different projects, and that this makes the “ex ante philanthropic utility function”, the function from spending to expected impact, less curved than it would be under more complete information…”
Or, in response to my point that “The philanthropic utility function for any given “cause” could exhibit more or less curvature than a typical individual utility function”, you say
I don’t find any argument convincing that philanthropic utility functions are more curved than typical individuals. (As I’ve noted above where you’ve attempted to argue this. This should be in “Justifying a riskier portfolio”, .
Could you point me to what you’re referring to, when you say you note this above? To my mind, one way that a within-cause philanthropic utility function could exhibit arbitrarily more curvature than a typical individual utility function is detailed in Appendix B.
So I can better understand what might be going on with all these evident failures of communication on my end more generally, instead of producing an ever-lengthening series of point by point replies, could you say more about why you don’t feel your questions are answered in these cases?
Against much financial risk tolerance
Yup!
To my mind, the first point applies to whatever resources are used throughout the future, whether it’s just the earth or some larger part of the universe.
I agree that the number/importance of welfare subjects in the future is a crucial consideration for how much to do longtermist as opposed to other work. But when comparing longtermist interventions—say, splitting a budget between lowering the risk of the world ending and proportionally increasing the fraction of resources devoted to creating happy artificial minds—it would seem to me that the “size of the future” typically multiplies the value of both interventions equally, and so doesn’t matter.