Scott Aaronson and Giulio Tononi (the main advocate of IIT) and others had an interesting exchange on IIT which goes into the details more than Muehlhauser’s report does. (Some of it is cited and discussed in the footnotes of Muehlhauser’s report, so you may well be aware of it already.) Here, here and here.
Great—I’m glad you agree!
I do have some reservations about (variance) normalisation, but it seems like a reasonable approach to consider. I haven’t thought about this loads though, so this opinion is not super robust.
Just to tie it back to the original question, whether we prioritise x-risk or WAS will depend on the agents who exist, obviously. Because x-risk mitigation is plausibly much more valuable on totalism than WAS mitigation is on other plausible views, I think you need almost everyone to have very very low (in my opinion, unjustifiably low) credence in totalism for your conclusion to go through. In the actual world, I think x-risk still wins. As I suggested before, it could be the case that the value of x-risk mitigation is not that high or even negative due to s-risks (this might be your best line of argument for your conclusion), but this suggests prioritising large scale s-risks. You rightly pointed out that millions of years of WAS is the most concrete example of s-risk we currently have. It seems plausible that other and larger s-risks could arise in the future (e.g. large scale sentient simulations), which, though admittedly speculative, could be really big in scale. I tend to think general foundational research aiming at improving the trajectory of the future is more valuable to do today than WAS mitigation. What I mean by ‘general foundational research’ is not entirely clear, but, for instance, thinking about and clarifying that seems more important than WAS mitigation.
I’m making a fresh comment to make some different points. I think our earlier thread has reached the limit of productive discussion.
I think your theory is best seen as a metanormative theory for aggregating both well-being of existing agents and the moral preferences of existing agents. There are two distinct types of value that we should consider:
prudential value: how good a state of affairs is for an agent (e.g. their level of well-being, according to utilitarianism; their priority-weighted well-being, according to prioritarianism).
moral value: how good a state of affairs is, morally speaking (e.g. the sum of total well-being, according to totalism; or the sum of total priority-weighted well-being, according to prioritarianism).
The aim of a population axiology is to determine the moral value of a state of affairs in terms of the prudential value of the agents who exist in that state of affairs. Each agent can have a preference order on population axiologies, expressing their moral preferences.
We could see your theory as looking at the prudential value of all the agents in a state of affairs (their level of well-being) and their moral preferences (how good they think the state of affairs is compared to other states of affairs in the choice set). The moral preferences, at least in part, determine the critical level (because you take into account moral intuitions, e.g. that the sadistic repugnant conclusion is very bad, when setting critical levels). So the critical level of an agent (on your view) expresses moral preferences of that agent. You then aggregate the well-being and moral preferences of agents to determine overall moral value. That is, you’re aggregating not just well-being, but also moral preferences, which is why I think this is best seen as a metanormative theory.
Because the critical level is used to express moral preferences (as opposed to purely discounting well-being), I think it’s misleading and the source of a lot of confusion to call this a critical level theory—it can incorporate critical level theories if agents have moral preferences for critical level theories—but the theory is, or should be, much more general. In particular, in determining the moral preferences of agents, one could (and, I think, should) take normative uncertainty into account, so that the ‘critical level’ of an agent represents their moral preferences after moral uncertainty. Aggregating these moral preferences means that your theory is actually a two-level metanormative theory: it can (and should) take standard normative uncertainty into account in determining the moral preferences of each agent, and then aggregates moral preferences across agents.
Hopefully, you agree with this characterisation of your view. I think there are now some things you need to say about determining the moral preferences of agents and how they should be aggregated. If I understand you correctly, each agent in a state of affairs looks at some choice set of states of affairs (states of affairs that could obtain in the future, given certain choices?) and comes up with a number representing how good or bad the state of affairs that they are in is. In particular, this number could be negative or positive. I think it’s best just to aggregate moral preferences directly, rather than pretending to use critical levels that we subtract from levels of well-being, and then aggregate ‘relative utility’, but that’s not an important point.
I think the choice-set dependence of moral preferences is not ideal, but I imagine you’ll disagree with me here. In any case, I think a similar theory could be specified that doesn’t rely on this choice-set dependence, though I imagine it might be harder to avoid the conclusions you aim to avoid, given choice-set independence. I haven’t thought about this much.
You might want to think more about whether summing up moral preferences is the best way to aggregate them. This form of aggregation seems vulnerable to extreme preferences that could dominate lots of mild preferences. I haven’t thought much about this and don’t know of any literature on this directly, but I imagine voting theory is very relevant here. In particular, the theory I’ve described looks just like a score voting method. Perhaps, you could place bounds on scores/moral preferences somehow to avoid the dominance of very strong preferences, but it’s not immediately clear to me how this could be done justifiably.
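To illustrate the worry about extreme preferences, here is a minimal sketch of summing scores with and without a bound. The numbers, the function names and the bound of 10 are all made up for illustration, and clipping is just one possible way of bounding scores; nothing here settles whether any particular bound could be justified.

```python
def sum_scores(scores):
    """Plain score-voting aggregation: add up every agent's score."""
    return sum(scores)


def bounded_sum(scores, bound):
    """Clip each score to [-bound, bound] before summing (hypothetical fix)."""
    return sum(max(-bound, min(bound, s)) for s in scores)


# Hypothetical electorate: 99 agents mildly in favour, 1 agent extremely opposed.
scores = [1] * 99 + [-1000]

unbounded_total = sum_scores(scores)     # the single extreme score dominates
bounded_total = bounded_sum(scores, 10)  # the mild majority now prevails
```

With these made-up numbers the unbounded sum is negative and the bounded sum is positive, so the verdict depends entirely on where the bound is placed.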
It’s worth noting that the resulting theory won’t avoid the sadistic repugnant conclusion unless every agent has very very strong moral preferences to avoid it. But I think you’re OK with that. I get the impression that you’re willing to accept it in increasingly strong forms, as the proportion of agents who are willing to accept it increases.
I’m not entirely sure what you mean by ‘rigidity’, but if it’s something like ‘having strong requirements on critical levels’, then I don’t think my argument is very rigid at all. I’m allowing for agents to choose a wide range of critical levels. The point is though, that given the well-being of all agents and critical levels of all agents except one, there is a unique critical level that the last agent has to choose, if they want to avoid the sadistic repugnant conclusion (or something very similar). At any point in my argument, feel free to let agents choose a different critical level to the one I have suggested, but note that doing so leaves you open to the sadistic repugnant conclusion. That is, I have suggested the critical levels that agents would choose, given the same choice set and given that they have preferences to avoid the sadistic repugnant conclusion.
Sure, if k is very low, you can claim that A is better than Bq, even if q is really really big. But, keeping q fixed, there’s a k (e.g. 10^10^10) such that Bq is better than A (feel free to deny this, but then your theory is lexical). Then at some point (assuming something like continuity), there’s a k such that A and Bq are equally good. Call this k’. If k’ is very low, then you get the sadistic repugnant conclusion. If k’ is very high, you face the same problems as lexical theories. If k’ is neither too high nor too low, you strike a compromise that makes the conclusions of each less bad, but you face both of them, so it’s not clear this is preferable. I should note that I thought of and wrote up my argument fairly quickly and quite late last night, so it could be wrong and is worth checking carefully, but I don’t see how what you’ve said so far refutes it.
My earlier points relate to the strangeness of the choice set dependence of relative utility. We agree that well-being should be choice set independent. But by letting the critical level be choice set dependent, you make relative utility choice set dependent. I guess you’re OK with that, but I find that undesirable.
Thanks for the reply!
I agree that it’s difficult to see how to pick a non-zero critical level non-arbitrarily; that’s one of the reasons I think it should be zero. I also agree that, given critical level utilitarianism, it’s plausible that the critical level can vary across people (and across the same person at different times). But I do think that whatever the critical level for a person in some situation is, it should be independent of other people’s well-being and critical levels. Imagine two scenarios consisting of the same group of people: in each, you have the exact same life/experiences and level of well-being, say, 5; you’re causally isolated from everyone else; the other people have different levels of well-being and different critical levels in each scenario such that in the first scenario, the aggregate of their moral value (the sum of well-being minus critical level for each person) is 1, and in the second this quantity is 7. If I’ve understood you correctly, in the first case, you should set your critical level to 6 - a, and in the second you should set it to 12 - a, where a is infinitesimal, so that the total moral value in each case is a, so that you avoid the sadistic repugnant conclusion. Why have a different level in each case? You aren’t affected by anyone else. If you were, you would be in a different situation/live a different life, so could maybe justify a different critical level. But I don’t see how you can justify that here.
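To make the arithmetic in the two scenarios explicit, here is a minimal sketch. The function name is made up, and the infinitesimal a is replaced by a small positive rational number, which is only a stand-in.

```python
from fractions import Fraction


def required_critical_level(own_wellbeing, others_moral_value, target_total):
    """Critical level c solving: others_moral_value + (own_wellbeing - c) == target_total."""
    return others_moral_value + own_wellbeing - target_total


# Assumption: a tiny positive rational stands in for the infinitesimal a.
a = Fraction(1, 10**9)

level_scenario_1 = required_critical_level(5, 1, a)  # others' moral value is 1
level_scenario_2 = required_critical_level(5, 7, a)  # others' moral value is 7
```

The two results are 6 - a and 12 - a: the same life at well-being 5 gets critical levels that differ by 6, purely because of what is true of other, causally isolated people.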
This relates to my point on it seeming ad hoc. You’re selecting your critical level to be the number such that when you aggregate moral value, you get an infinitesimal so that you avoid the sadistic repugnant conclusion, without other justification for setting the critical level at that level. That strikes me as ad hoc.
I think you introduce another element of arbitrariness too. Why set your critical level to 12 - a, when the others could set theirs to something else such that you need only set yours to 10 - a? There are multiple different critical levels you could set yours to, if others change theirs too, that give you the result you want. Why pick one solution over any other?
Finally, I don’t think you really avoid the problems facing lexical value theories, at least not without entailing the sadistic repugnant conclusion. This is a bit technical. I’ve edited it to make it as clear as I can, but I think I need to stop now; I hope it makes sense. The main idea is to highlight a trade-off you have to make between avoiding the repugnant conclusion and avoiding the counter-intuitive implications of lexical value theories.
Let’s go with your example: 1 person at well-being −10, critical level 5; 1 person at well-being 30, so they set their critical level to 15 - a, so that the overall moral value is a. Now suppose:
(A) We can improve the first person’s well-being to 0 and leave the second person at 30, or
(B) We can improve the second person’s well-being to 300,000 and leave the first person at −10.
Assume the first person keeps their critical level at 5 in each case. If I’ve understood you correctly, in the first case, the second person should set their critical level to 25 - b, so that the total moral value is an infinitesimal, b; and in the second case, they should set it to 299,985 - c, so that again, the total moral value is an infinitesimal, c. If b > c or b = c, we get the problems facing lexical theories. So let’s say we choose b and c such that c > b. But if we also consider:
(C) We can improve the second person’s well-being to 31 and leave the first person at −10.
Here we choose critical level 16 - d. I assume you want b > d, because I assume you want to say that (C) is worse than (A). So if x(n) is the infinitesimal used when we can increase the second person’s well-being to n, we have x(300,000) > b > x(31). At some point, we’ll have m such that x(m+1) > b > x(m) (assuming some continuity, which I think is very plausible), but for simplicity, let’s say there’s an m such that x(m) = b. For concreteness, let’s say m = 50, so that we’re indifferent between increasing the second person’s well-being to 50 and increasing the first person’s to 0.
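As a check on the numbers in (A), (B) and (C), here is a minimal sketch that sums well-being minus critical level for each case. The infinitesimals b, c and d are replaced by small positive rationals chosen so that c > b > d; those particular values are arbitrary stand-ins.

```python
from fractions import Fraction


def total_moral_value(people):
    """Sum of (well-being - critical level) over (wellbeing, critical_level) pairs."""
    return sum(w - crit for w, crit in people)


# Assumption: tiny rationals stand in for the infinitesimals, ordered c > b > d.
b = Fraction(2, 10**9)
c = Fraction(3, 10**9)
d = Fraction(1, 10**9)

# The first person keeps critical level 5 throughout.
case_a = total_moral_value([(0, 5), (30, 25 - b)])
case_b = total_moral_value([(-10, 5), (300_000, 299_985 - c)])
case_c = total_moral_value([(-10, 5), (31, 16 - d)])
```

Each total reduces to the corresponding stand-in (b, c and d), so the ranking of the three options is fixed entirely by how the infinitesimals are ordered.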
Now for a positive integer q, consider:
(Bq) We have q people at positive well-being level k, and the first person at well-being level −10.
Repeating the above procedure (for fixed q, letting k vary), there’s a well-being level k(q) such that we’re indifferent between (A) and (Bq). We can do this for each q. Then let’s say k(2) = 20, k(4) = 10, k(10) = 4, k(20) = 2, k(40) = 1 and so on… (This just gives the same ordering as totalism in these cases; I just chose factors of 40 in that sequence to make the arithmetic nice.) This means we’re indifferent between (A) and 40 people at well-being 1 with one person at −10, so we’d rather have 41 people at 1 and one person at −10 than (A). Increasing 41 allows us to get the same result with well-being levels even lower than 1 -- so this is just the sadistic repugnant conclusion. You can make it less bad by discounting positive well-being, but then you’ll inherit the problems facing lexical theories. Say you discount so that as q (the number of people) tends to infinity, the well-being level at which you’re indifferent with (A) tends to some positive number—say 10. Then 300,000 people at level 10 and one person at level −10 is worse than (A). But that means you face the same problem as lexical theories because you’ve traded vast amounts of positive well-being for a relatively small reduction in negative well-being. The lower you let this limit be, the closer you get to the sadistic repugnant conclusion, and the higher you let it be, the more your theory looks like lexical negative utilitarianism. You might try to get round this by appealing to something like vagueness/indeterminacy or incommensurability, but these approaches also have counter-intuitive results.
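The sequence k(q) above can be reproduced with a short sketch. It assumes the plain totalist sum of well-being, which is what the sequence is said to match, and that (Bq) contains only the q people plus the first person; the function names are made up.

```python
from fractions import Fraction

# Option (A): first person at 0, second person at 30.
TOTAL_A = 0 + 30


def total_bq(q, k):
    """Totalist sum for (Bq): q people at level k plus one person at -10."""
    return q * k - 10


def indifference_level(q):
    """Level k(q) at which (Bq) ties with (A) under the totalist sum."""
    return Fraction(TOTAL_A + 10, q)


levels = {q: indifference_level(q) for q in (2, 4, 10, 20, 40)}

# One extra person at level 1 tips the balance in favour of (Bq).
forty_one_at_one = total_bq(41, 1)
```

Here `levels` gives 20, 10, 4, 2 and 1, and 41 people at level 1 total 31 against 30 for (A). Raising q further pushes the tie-breaking level below 1.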
Your theory is an interesting way to avoid the repugnant conclusions, and in some sense, it strikes a nice balance between totalism and lexical negative utilitarianism, but it also inherits the weaknesses of at least one of them. And I must admit, I find the complete subjectivity of the critical levels bizarre and very hard to stomach. Why not just drop the messy and counter-intuitive subjectively set variable critical level utilitarianism and prefer quasi-negative utilitarianism based on lexical value? As we’ve both noted, that view is problematic, but I don’t think it’s more problematic than what you’re proposing and I don’t think its problems are absolutely devastating.
Nice post! I enjoyed reading this but I must admit that I’m a bit sceptical.
I find your variable critical level utilitarianism troubling. Having a variable critical level seems OK in principle, but I find it quite bizarre that moral patients can choose what their critical value is i.e. they can choose how morally valuable their life is. How morally good or bad a life is doesn’t seem to be a matter of choice and preferences. That’s not to say people can’t disagree about where the critical level should be, but I don’t see why this disagreement should reflect a difference in individual’s own critical levels—plausibly these disagreements are about other people’s as well. In particular, you’ll have a very hard time convincing anyone who takes morality to be mind-independent to accept this view. I would find the view much more plausible if the critical level were determined for each person by some other means.
I’d be interested to hear what kind of constraints you’d suggest on choosing levels. If you don’t allow any, then I am free to choose a low negative critical level and live a very painful life, and this could be morally good. But that’s more absurd than the sadistic repugnant conclusion, so you need some constraints. You seem to want to allow people the autonomy to choose their own critical level but also require that everyone chooses a level that is infinitesimally less than their welfare level in order to avoid the sadistic repugnant conclusion. There’s a tension here that needs to be resolved. But also, I don’t see how you can use the need to avoid the sadistic repugnant conclusion as a constraint for choosing critical levels without being really ad hoc.
I think you’d be better arguing for quasi-negative utilitarianism directly or in some other way: you might claim that all positive welfare is only of infinitesimal moral value but that (at least some) suffering is of non-infinitesimal moral disvalue. It’s really difficult to get this to work though, because you’re introducing value lexicality, i.e. some suffering is infinitely worse than any amount of happiness. This implies that you would prefer to relieve a tiny amount of non-infinitesimal suffering over experiencing any finite amount of happiness. And plausibly you’d prefer to avoid a tiny but non-infinitesimal chance of a tiny amount of non-infinitesimal suffering over a guaranteed experience of any finite amount of happiness. This seems more troubling than the sadistic repugnant conclusion to me. I think you can sweeten the pill though by setting the bar of non-infinitesimal suffering quite high, e.g. being eaten alive. This would allow trade-offs between most suffering and happiness as usual (allowing the sadistic repugnant conclusion concerning happiness and the ‘lesser’ forms of suffering) while still granting lexical superiority to extreme suffering. This strikes me as the most plausible view in this region of population ethical theories. I’d be interested to hear what you think.
Even if you get a plausible version of quasi-negative utilitarianism (QNU) that favours WAS over x-risk, I don’t think the conclusion you want will follow easily when moral uncertainty is taken into account. How do you propose to decide what to do under normative uncertainty? Even if you find QNU more plausible than classical utilitarianism (CU), it doesn’t follow that we should prioritise WAS unless you take something like the ‘my favourite theory’ approach to normative uncertainty, which is deeply unsatisfying. The most plausible approaches to normative uncertainty (e.g. ‘maximise expected choice-worthiness’) take both credences in the relevant theories and the value the theories assign to outcomes into account. If the expected value of working on x-risk according to CU is many times greater than the expected value of working on WAS according to QNU (which is plausible), then all else being equal, you need your credence in QNU to be many times greater than your credence in CU. We could easily be looking at a factor of 1000 here, which would require something like a credence < 0.1% in CU, but that’s surely way too low, despite the sadistic repugnant conclusion.
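Here is a minimal sketch of that expected choice-worthiness comparison. It makes strong simplifying assumptions that are not argued for above: CU and QNU are the only theories, with credences summing to 1; their values are comparable on a common scale; and each theory assigns zero value to the other theory’s favoured cause. The function names and the unit of value are made up.

```python
from fractions import Fraction


def expected_choiceworthiness(credences, values):
    """Credence-weighted sum of the value each theory assigns to an option."""
    return sum(credences[theory] * values[theory] for theory in credences)


def break_even_cu_credence(value_ratio):
    """CU credence at which the two causes tie, given only CU and QNU.

    Solves (1 - p) * 1 == p * value_ratio for p.
    """
    return Fraction(1, 1 + value_ratio)


# Assumed values: x-risk is worth 1000 units on CU, WAS is worth 1 unit on QNU.
xrisk_values = {"CU": 1000, "QNU": 0}
was_values = {"CU": 0, "QNU": 1}

p = break_even_cu_credence(1000)
tie_credences = {"CU": p, "QNU": 1 - p}
xrisk_at_tie = expected_choiceworthiness(tie_credences, xrisk_values)
was_at_tie = expected_choiceworthiness(tie_credences, was_values)

# With a modest 1% credence in CU, x-risk already comes out well ahead.
modest = {"CU": Fraction(1, 100), "QNU": Fraction(99, 100)}
xrisk_modest = expected_choiceworthiness(modest, xrisk_values)
was_modest = expected_choiceworthiness(modest, was_values)
```

Under these assumptions the break-even credence in CU is 1/1001, so WAS only wins if the credence in CU is lower than that.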
A response you might make is that the expected value of preventing x-risk according to CU is actually not that high (or maybe even negative), due to increased chances of s-risks, given that we don’t go extinct. But if this is the case, we’re probably better off focusing on those s-risks rather than WAS, since they’d have to be really really big to bring x-risk mitigation down to WAS level on CU. It’s possible that working on WAS today is a good way to gain information and improve our chances of good s-risk mitigation in the future, especially since we don’t know very much about large s-risks and don’t have experience mitigating them. But I think it would be suspiciously convenient if working on WAS now turned out to be the best thing for future s-risk mitigation (even on subjective expected value terms given our current evidence). I imagine we’d be better off working on large scale s-risks directly.