Here, you say, “Several of the grants we’ve made to Rethink Priorities funded research related to moral weights.” Yet in your initial response, you said, “We don’t use Rethink’s moral weights.” I respect your tapping out of this discussion, but at the same time I’d like to express my puzzlement as to why Open Phil would fund work on moral weights to inform grantmaking allocation, and then not take that work into account.
One can value research and find it informative or worth doing without being convinced of every view of a given researcher or team. Open Philanthropy also sponsored a contest to surface novel considerations that could affect its views on AI timelines and risk. The winners mostly present conclusions or considerations on which AI would be a lower priority, but that doesn’t imply that the judges or the institution changed their views very much in that direction.
At large scale, information can be valuable enough to buy even if it only modestly adjusts proportional allocations of effort; the minimum bar for funding a research project with hundreds of thousands or millions of dollars presumably isn’t that one pivots billions of dollars on the results with near-certainty.
Thank you for engaging. I don’t disagree with what you’ve written; I think you have interpreted me as implying something stronger than what I intended, and so I’ll now attempt to add some colour.
That Emily and other relevant people at OP have not fully adopted Rethink’s moral weights does not puzzle me. As you say, to expect that is to apply an unreasonably high funding bar. I am, however, puzzled that Emily and co. appear to have not updated at all towards Rethink’s numbers. At least, that’s the way I read:
We don’t use Rethink’s moral weights.
Our current moral weights, based in part on Luke Muehlhauser’s past work, are lower. We may update them in the future; if we do, we’ll consider work from many sources, including the arguments made in this post.
If OP has not updated at all towards Rethink’s numbers, then I see three possible explanations, all of which I find unlikely, hence my puzzlement. First possibility: the relevant people at OP have not yet given the Rethink report a thorough read, and have therefore not updated. Second: the relevant OP people have read the Rethink report, and have updated their internal models, but have not yet gotten around to updating OP’s actual grantmaking allocation. Third: OP believes the Rethink work is low quality or otherwise critically corrupted by one or more errors. I’d be very surprised if the first or second is true, given that moral weight is arguably the most important consideration in neartermist grantmaking allocation. I’d also be surprised if the third is true, given how well Rethink’s moral weight sequence has been received on this forum (see, e.g., comments here and here).[1] OP people may disagree with Rethink’s approach at the independent impression level, but surely, given that Rethink’s moral weights work is the most extensive work done on this topic by anyone(?), the Rethink results should be given substantial weight—or at least non-trivial weight—in their all-things-considered views?
(If OP people believe there are errors in the Rethink work that render the results ~useless, then, considering the topic’s importance, I think some sort of OP write-up would be well worth the time. Both at the object level, so that future moral weight researchers can avoid making similar mistakes, and to allow the community to hold OP’s reasoning to a high standard, and also at the meta level, so that potential donors can update appropriately re. Rethink’s general quality of work.)
Additionally (and this is less important), I’m puzzled at the meta level at the way we’ve arrived here. As noted in the top-level post, Open Phil has been less than wholly open about its grantmaking, and it’s taken a pretty not-on-the-default-path sequence of events—Ariel, someone who’s not affiliated with OP and who doesn’t work on animal welfare for their day job, writing this big post; Emily from OP replying to the post and to a couple of the comments; me, a Forum-goer who doesn’t work on animal welfare, spotting an inconsistency in Emily’s replies—to surface the fact that OP does not give Rethink’s moral weights any weight.
Edited to add: Carl has left a detailed reply below, and it seems that the third explanation is, in fact, what has happened.
Our current moral weights, based in part on Luke Muehlhauser’s past work, are lower. We may update them in the future; if we do, we’ll consider work from many sources, including the arguments made in this post.
Interestingly and confusingly, fitting distributions to Luke’s 2018 guesses for the 80% prediction intervals of the moral weight of various species, one gets mean moral weights close to or larger than 1.
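(For readers who want to see the mechanics, here is a minimal sketch of that kind of fitting exercise, using a hypothetical 80% interval rather than Luke’s actual numbers. A right-skewed distribution fit to a wide interval has a mean well above its median, which is one way means near or above 1 can arise from seemingly modest interval guesses.)

```python
import numpy as np
from scipy import stats

def lognormal_from_80ci(lo, hi):
    """Fit a lognormal whose 10th/90th percentiles match a stated 80% prediction interval."""
    z = stats.norm.ppf(0.9)  # ~1.2816; the 10th/90th percentiles sit at -z and +z
    mu = (np.log(lo) + np.log(hi)) / 2
    sigma = (np.log(hi) - np.log(lo)) / (2 * z)
    return stats.lognorm(s=sigma, scale=np.exp(mu))

# Hypothetical 80% interval of (0.05, 2) for some species' moral weight relative to humans:
dist = lognormal_from_80ci(0.05, 2.0)
print(dist.median())  # ~0.32
print(dist.mean())    # ~0.89: the heavy right tail pulls the mean to roughly 3x the median
```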
It is also worth noting that Luke seemed very much willing to update on further research in 2022. Commenting on the above, Luke said (emphasis mine):
Since this exercise is based on numbers I personally made up, I would like to remind everyone that those numbers are extremely made up and come with many caveats given in the original sources. It would not be that hard to produce numbers more reasonable than mine, at least re: moral weights. (I spent more time on the “probability of consciousness” numbers, though that was years ago and my numbers would probably be different now.)
Welfare ranges are a crucial input to determining moral weights, so I assume Luke would also have agreed that it would not have been that hard to produce more reasonable welfare ranges than his and Open Phil’s in 2022. So, given how little time Open Phil seemingly devoted to assessing welfare ranges in comparison to Rethink, I would have expected Open Phil to give major weight to Rethink’s values.
I can’t speak for Open Philanthropy, but I can explain why I personally was unmoved by the Rethink report (and think its estimates hugely overstate the case for focusing on tiny animals, although I think the corrected version of that case still has a lot to be said for it).
Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.
However, I say “naively” because this doesn’t actually work, due to two-envelope effects...whenever you’re tempted to multiply such numbers by something, remember two-envelope effects!)
[Edited for clarity] I was not satisfied with Rethink’s attempt to address that central issue, that you get wildly different results from assuming the moral value of a fruit fly is fixed and reporting possible ratios to elephant welfare as opposed to doing it the other way around.
It is not unthinkably improbable that an elephant brain where reinforcement from a positive or negative stimulus adjusts millions of times as many neural computations could be seen as vastly more morally important than a fruit fly, just as one might think that a fruit fly is much more important than a thermostat (which some suggest is conscious and possesses preferences). Since on some major functional aspects of mind there are differences of millions of times, that suggests a mean expected value orders of magnitude higher for the elephant if you put a bit of weight on the possibility that moral weight scales with the extent of, e.g., the computations that are adjusted by positive and negative stimuli. A 1% weight on that plausible hypothesis means the expected value of the elephant is immense vs the fruit fly. So there will be something that might get lumped in with ‘overwhelming hierarchicalism’ in the language of the top-level post. Rethink’s various discussions of this issue in my view missed the mark.
Go the other way and fix the value of the elephant at 1, and the possibility that value scales with those computations is treated as a case where the fly is worth ~0. Then a 1% or even 99% credence in value scaling with computation has little effect, and the elephant-fruit fly ratio is forced to be quite high so tiny mind dominance is almost automatic. The same argument can then be used to make a like case for total dominance of thermostat-like programs, or individual neurons, over insects. And then again for individual electrons.
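(To make the asymmetry concrete, here is a minimal numeric sketch with made-up numbers: a single “scaling” hypothesis held at 1% credence against an “equal value” hypothesis at 99%. It illustrates the normalization point only, not anyone’s actual estimates.)

```python
# Two hypotheses: H_scale ("value scales with reinforcement-adjusted computation",
# elephant = 1,000,000 x fly) and H_equal ("equal moral weight"), with P(H_scale) = 0.01.
p_scale = 0.01
ratio_under_scale = 1_000_000  # hypothetical elephant:fly ratio under H_scale

# Normalize with the fly fixed at 1: the elephant's expected weight is enormous.
e_elephant_fly_fixed = p_scale * ratio_under_scale + (1 - p_scale) * 1   # ~10,001

# Normalize with the elephant fixed at 1: the fly's expected weight stays ~1,
# so the implied elephant:fly ratio is ~1 and tiny-mind dominance follows by count.
e_fly_elephant_fixed = p_scale * (1 / ratio_under_scale) + (1 - p_scale) * 1  # ~0.99

print(e_elephant_fly_fixed)      # ~10000.99
print(1 / e_fly_elephant_fixed)  # ~1.01 -- same credences, wildly different implied ratios
```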
As I see it, Rethink basically went with the ‘ratios to fixed human value’, so from my perspective their bottom-line conclusions were predetermined and uninformative. But the alternatives they ignore lead me to think that the expected value of welfare for big minds is a lot larger than for small minds (and I think that can continue, e.g. giant AI minds with vastly more reinforcement-affected computations and thoughts could possess much more expected welfare than humans, as many humans might have more welfare than one human).
I agree with Brian Tomasik’s comment from your link:
the moral-uncertainty version of the [two envelopes] problem is fatal unless you make further assumptions about how to resolve it, such as by fixing some arbitrary intertheoretic-comparison weights (which seems to be what you’re suggesting) or using the parliamentary model.
By the same token, arguments about the number of possible connections/counterfactual richness in a mind could suggest superlinear growth in moral importance with computational scale. Similar issues would arise for theories involving moral agency or capacity for cooperation/game theory (on which humans might stand out by orders of magnitude relative to elephants; marginal cases being socially derivative), but those were ruled out of bounds for the report. Likewise it chose not to address intertheoretic comparisons and how those could very sharply affect the conclusions. Those are the kinds of issues with the potential to drive massive weight differences.
I think some readers benefitted a lot from reading the report because they did not know that, e.g. insects are capable of reward learning and similar psychological capacities. And I would guess that will change some people’s prioritization between different animals, and of animal vs human focused work. I think that is valuable. But that information was not new to me, and indeed I had argued for many years that insects met a lot of the functional standards one could use to identify the presence of well-being, and that even after taking two-envelopes issues and nervous system scale into account expected welfare at stake for small wild animals looked much larger than for FAW.
I happen to be a fan of animal welfare work relative to GHW’s other grants at the margin because animal welfare work is so highly neglected (e.g. Open Philanthropy is a huge share of world funding on the most effective FAW work but quite small compared to global aid) relative to the case for views on which it’s great. But for me Rethink’s work didn’t address the most important questions, and largely baked in its conclusions methodologically.
Thanks for your discussion of the Moral Weight Project’s methodology, Carl. (And to everyone else for the useful back-and-forth!) We have some thoughts about this important issue and we’re keen to write more about it. Perhaps 2024 will provide the opportunity!
For now, we’ll just make one brief point, which is that it’s important to separate two questions. The first concerns the relevance of the two envelopes problem to the Moral Weight Project. The second concerns alternative ways of generating moral weights. We considered the two envelopes problem at some length when we were working on the Moral Weight Project and concluded that our approach was still worth developing. We’d be glad to revisit this and appreciate the challenge to the methodology.
However, even if it turns out that the methodology has issues, it’s an open question how best to proceed. We grant the possibility that, as you suggest, more neurons = more compute = the possibility of more intense pleasures and pains. But it’s also possible that more neurons = more intelligence = less biological need for intense pleasures and pains, as other cognitive abilities can provide the relevant fitness benefits, effectively muting the intensities of those states. Or perhaps there’s some very low threshold of cognitive complexity for sentience after which point all variation in behavior is due to non-hedonic capacities. Or perhaps cardinal interpersonal utility comparisons are impossible. And so on. In short, while it’s true that there are hypotheses on which elephants have massively more intense pains than fruit flies, there are also hypotheses on which the opposite is true and on which equality is (more or less) true. Once we account for all these hypotheses, it may still work out that elephants and fruit flies differ by a few orders of magnitude in expectation, but perhaps not by five or six. Presumably, we should all want some approach, whatever it is, that avoids being mugged by whatever low-probability hypothesis posits the largest difference between humans and other animals.
That said, you’ve raised some significant concerns about methods that aggregate over different relative scales of value. So, we’ll be sure to think more about the degree to which this is a problem for the work we’ve done—and, if it is, how much it would change the bottom line.
I agree that I am also disagreeing on the object level, as Michael made clear with his comments (I do not think I am talking about a tiny chance, although I do not think the RP discussions characterized my views as I would), and on some other methodological issues besides two-envelopes (related to the object-level ones). E.g. I would not want to treat a highly networked AI mind (with billions of bodies and computation directing them in a unified way, on the scale of humanity) as a millionth or a billionth of the welfare of the same set of robots and computations with less integration (and overlap of shared features, or top-level control), ceteris paribus.
Indeed, I would be wary of treating the integrated mind as though welfare stakes for it were half or a tenth as great, seeing that as a potential source of moral catastrophe, like ignoring the welfare of minds not based on proteins. E.g. having tasks involving suffering and frustration done by large integrated minds, and pleasant ones done by tiny minds, while increasing the amount of mental activity in the former. It sounds like the combination of object-level and methodological takes attached to these reports would favor almost completely ignoring the integrated mind.
Incidentally, in a world where small animals are being treated extremely badly and are numerous, I can see a temptation to err in their favor, since even overestimates of their importance could be shifting things in the right marginal policy direction. But thinking about the potential moral catastrophes on the other side helps sharpen the motivation to get it right.
In practice, I don’t prioritize moral weights issues in my work, because I think the most important decisions hinging on them will come in an era with AI-aided mature sciences of mind, philosophy, and epistemology. And, as I have written, regardless of your views about small minds and large minds, it won’t be the case that e.g. humans are utility monsters of impartial hedonism (rather than something bigger, smaller, or otherwise different), and the grounds for focusing on helping humans won’t be terminally impartial-hedonistic in nature. But from my viewpoint, baking in the assumption that integration (and unified top-level control or mental overlap of some parts of computation) close to eliminates mentality or welfare (vs less integrated collections of computations) seems bad in a non-Pascalian fashion.
FWIW, I think something like conscious subsystems (in huge numbers in one neural network) is more plausible by design in future AI. It just seems unlikely in animals because all of the apparent subjective value seems to happen at roughly the highest level where everything is integrated in an animal brain.
Felt desire seems to (largely) be motivational salience, a top-down/voluntary attention control function driven by high-level interpretations of stimuli (e.g. objects, social situations), so relatively late in processing. Similarly, hedonic states depend on high-level interpretations, too.
Or, according to Attention Schema Theory, attention models evolved for the voluntary control of attention. It’s not clear what the value would be for an attention model at lower levels of organization before integration.
And evolution will select against realizing functions unnecessarily if they have additional costs, so we would need a positive argument that the necessary functions are realized earlier or multiple times in parallel in a way that overcomes or doesn’t incur such additional costs.
So, it’s not that integration necessarily reduces value; it’s that, in animals, all the morally valuable stuff happens after most of the integration, and apparently only once or a small number of times.
In artificial systems, the morally valuable stuff could instead be implemented separately by design at multiple levels.
EDIT:
I think there’s still a crux about whether realizing the same function the same number of times but “to a greater degree” makes it more morally valuable. I think there are some ways of “to a greater degree” that don’t matter, and some that could. If it’s only sort of (vaguely) true that a system is realizing a certain function, or it realizes some but not all of the functions possibly necessary for some type of welfare in humans, then we might discount it for only meeting lower precisifications of the vague standards. But adding more neurons just doing the same things:
1. doesn’t make it more true that it realizes the function or the type of welfare (e.g. adding more neurons to my brain wouldn’t make it more true that I can suffer),
2. doesn’t clearly increase welfare ranges, and
3. doesn’t have any other clear reason for why it should make a moral difference (I think you disagree with this, based on your examples).
But maybe we don’t actually need good specific reasons to assign non-tiny probabilities to neuron count scaling for points 2 or 3, and then we get domination of neuron count scaling in expectation, depending on what we’re normalizing by, like you suggest.
This consideration is something I had never thought of before and blew my mind. Thank you for sharing.
Hopefully I can summarize it (assuming I interpreted it correctly) in a different way that might help people who were as befuddled as I was.
The point is that, when you assign probabilistic weight to two different theories of sentience, you have to assign units to sentience under each theory in order to compare them.
Say you have two theories of sentience that are similarly probable, one dependent on intelligence and one dependent on brain size. Call their units IQ-qualia and size-qualia. If you assign fruit flies a moral weight of 1, you are implicitly declaring a conversion rate of (to make up some random numbers) 1000 IQ-qualia = 1 size-qualia. If you instead assign elephants a moral weight of 1, you implicitly declare a conversion rate of (again, made up) 1 IQ-qualia = 1000 size-qualia, because elephant brains are much larger than fruit fly brains but elephants are not correspondingly smarter. These two different conversion rates are going to give you very different numbers for the moral weight of humans (or, as Shulman was saying, of elephants and fruit flies relative to each other).
Rethink Priorities assigned humans a moral weight of 1, and thus assumed a certain conversion rate between different theories that made for a very small-animal-dominated world by sentience.
It is not unthinkably improbable that an elephant brain where reinforcement from a positive or negative stimulus adjusts millions of times as many neural computations could be seen as vastly more morally important than a fruit fly, just as one might think that a fruit fly is much more important than a thermostat (which some suggest is conscious and possesses preferences). Since on some major functional aspects of mind there are differences of millions of times, that suggests a mean expected value orders of magnitude higher for the elephant if you put a bit of weight on the possibility that moral weight scales with the extent of, e.g., the computations that are adjusted by positive and negative stimuli.
This specific kind of account, if meant to depend inherently on differences in reinforcement, is very improbable to me (<0.1%), and conditional on such accounts, the inherent importance of reinforcement would also very probably scale very slowly, with faster scaling increasingly improbable. It could work out that the expected scaling isn’t slow, but that would be because of very low probability possibilities.
The value of subjective wellbeing, whether hedonistic, felt desires, reflective evaluation/preferences, choice-based or some kind of combination, seems very probably logically independent from how much reinforcement happens (EDIT: and empirically dissociable). My main argument is that reinforcement happens unconsciously and has no necessary or ~immediate conscious effects. We could imagine temporarily or permanently preventing reinforcement without any effect on mental states or subjective wellbeing in the moment. Or, we can imagine connecting a brain to an artificial neural network to add more neurons to reinforce, again to no effect.
And even within the same human under normal conditions, holding their reports of value or intensity fixed, the amount of reinforcement that actually happens will probably depend systematically on the nature of the experience, e.g. physical pain vs anxiety vs grief vs joy. If reinforcement has a large effect on expected moral weights, you could and I’d guess would end up with an alienating view, where everyone is systematically wrong about the relative value of their own experiences. You’d effectively need to reweight all of their reports by type of experience.
So, even with intertheoretic comparisons between accounts with and without reinforcement, of which I’d be quite skeptical specifically in this case but also generally, this kind of hypothesis shouldn’t make much difference (or it does make a substantial difference, but it seems objectionably fanatical and alienating). If rejecting such intertheoretic comparisons, as I’m more generally inclined to do and as Open Phil seems to be doing, it should make very little difference.
There are more plausible functions you could use, though, like attention. But, again, I think the cases for intertheoretic comparisons between accounts of how moral value scales with neurons for attention or probably any other function are generally very weak, so you should only take expected values over descriptive uncertainty conditional on each moral scaling hypothesis, not across moral scaling hypotheses (unless you normalize by something else, like variance across options). Without intertheoretic comparisons, approaches to moral uncertainty in the literature aren’t so sensitive to small probability differences or fanatical about moral views. So, it tends to be more important to focus on large probability shifts than improbable extreme cases.
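(As an aside on the “normalize by something else” option: below is a minimal sketch of variance normalization with entirely made-up options and numbers, shown only to illustrate the mechanics rather than anyone’s actual view. Each hypothesis’s scores are rescaled to zero mean and unit variance across the options before being credence-weighted, so no unit conversion between hypotheses is assumed.)

```python
import numpy as np

# Hypothetical options and per-hypothesis valuations (illustrative numbers only).
options = ["save 1 human", "save 1,000 chickens", "save 1,000,000 flies"]
values = {
    "equal weight per individual": np.array([1.0, 1_000.0, 1_000_000.0]),
    "weight ~ neuron count":       np.array([1.0, 2.6, 1.6]),  # rough per-option totals
}
credences = {"equal weight per individual": 0.5, "weight ~ neuron count": 0.5}

mixed = np.zeros(len(options))
for name, v in values.items():
    # Rescale this hypothesis's scores to zero mean, unit variance across options,
    # then weight by credence; hypotheses are never compared in their own units.
    mixed += credences[name] * (v - v.mean()) / v.std()

print(dict(zip(options, mixed.round(2))))  # higher = better under this mixture
```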
(I’m not at Rethink Priorities anymore, and I’m not speaking on their behalf.)
Rethink’s work, as I read it, did not address that central issue, that you get wildly different results from assuming the moral value of a fruit fly is fixed and reporting possible ratios to elephant welfare as opposed to doing it the other way around.
(...)
Rethink’s discussion of this almost completely sidestepped the issue in my view.
RP did in fact respond to some versions of these arguments, in the piece Do Brains Contain Many Conscious Subsystems? If So, Should We Act Differently?, of which I am a co-author.
Thanks, I was referring to this as well, but should have had a second link for it as the Rethink page on neuron counts didn’t link to the other post. I think that page is a better link than the RP page I linked, so I’ll add it in my comment.
(Again, not speaking on behalf of Rethink Priorities, and I don’t work there anymore.)
(Btw, the quote formatting in your original comment got messed up with your edit.)
I think the claims I quoted are still basically false, though?
Rethink’s work, as I read it, did not address that central issue, that you get wildly different results from assuming the moral value of a fruit fly is fixed and reporting possible ratios to elephant welfare as opposed to doing it the other way around.
Do Brains Contain Many Conscious Subsystems? If So, Should We Act Differently? explicitly considered a conscious subsystems version of this thought experiment, focusing on the more human-favouring side when you normalize by small systems like insect brains, which is the non-obvious side often neglected. There’s a case that conscious subsystems could dominate expected welfare ranges even without intertheoretic comparisons (but also possibly with), so I think we were focusing on one of the strongest and most important arguments for humans potentially mattering more, assuming hedonism and expectational total utilitarianism. Maximizing expected choiceworthiness with intertheoretic comparisons is controversial and only one of multiple competing approaches to moral uncertainty. I’m personally very skeptical of it because of the arbitrariness of intertheoretic comparisons and its fanaticism (including chasing infinities, and lexically higher and higher infinities). Open Phil also already avoids making intertheoretic comparisons, but was more sympathetic to normalizing by humans if it were going to.
I don’t want to convey that there was no discussion, thus my linking the discussion and saying I found it inadequate and largely missing the point from my perspective. I made an edit for clarity, but would accept suggestions for another.
Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.
Let me try to restate your point, and suggest why one may disagree. If one puts weight w on the welfare range (WR) of humans relative to that of chickens being N, and 1 - w on it being n, the expected welfare range of:
Humans relative to that of chickens is E(“WR of humans”/”WR of chickens”) = w*N + (1 - w)*n.
Chickens relative to that of humans is E(“WR of chickens”/”WR of humans”) = w/N + (1 - w)/n.
You are arguing that N can plausibly be much larger than n. For the sake of illustration, we can say N = 389 (ratio between the 86 billion neurons of a human and the 221 million of a chicken), n = 3.01 (reciprocal of RP’s median welfare range of chickens relative to humans of 0.332), and w = 1/12 (since the neuron count model was one of the 12 models RP considered, and all of them were weighted equally). Having the welfare range of:
Chickens as the reference, E(“WR of humans”/”WR of chickens”) = 35.2. So 1/E(“WR of humans”/”WR of chickens”) = 0.0284.
Humans as the reference (as RP did), E(“WR of chickens”/”WR of humans”) = 0.305.
So, as you said, determining welfare ranges relative to humans results in animals being weighted more heavily. However, I think the difference is much smaller than suggested above. Since N and n are quite different, I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above. If so, both approaches output exactly the same result:
E(“WR of humans”/”WR of chickens”) = N^w*n^(1 - w) = 4.49. So 1/E(“WR of humans”/”WR of chickens”) = (N^w*n^(1 - w))^-1 = 0.223.
E(“WR of chickens”/”WR of humans”) = (1/N)^w*(1/n)^(1 - w) = 0.223.
The reciprocal of the expected value is not the expected value of the reciprocal, so using the mean leads to different results. However, I think we should be using the geometric mean, and the reciprocal of the geometric mean is the geometric mean of the reciprocal. So the 2 approaches (using humans or chickens as the reference) will output the same ratios regardless of N, n and w as long as we aggregate N and n with the geometric mean. If N and n are similar, it no longer makes sense to use the geometric mean, but then both approaches will output similar results anyway, so RP’s approach looks fine to me as a 1st pass. Does this make any sense?
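(A short snippet reproducing the arithmetic above, for anyone who wants to check the numbers as given; small differences from the figures quoted are rounding.)

```python
# Numbers as given above: N = 389, n = 3.01, w = 1/12.
N, n, w = 389, 3.01, 1 / 12

# Weighted arithmetic means: the two normalizations disagree.
e_h_over_c = w * N + (1 - w) * n          # ~35.2
e_c_over_h = w / N + (1 - w) / n          # ~0.305, not 1/35.2 = ~0.0284

# Weighted geometric means: the two normalizations agree.
g_h_over_c = N**w * n**(1 - w)            # ~4.5 (the ~4.49 above)
g_c_over_h = (1 / N)**w * (1 / n)**(1 - w)  # ~0.22, exactly 1 / g_h_over_c

print(e_h_over_c, 1 / e_h_over_c, e_c_over_h)
print(g_h_over_c, 1 / g_h_over_c, g_c_over_h)
```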
Of course, it would still be good to do further research (which OP could fund) to adjudicate how much weight should be given to each model RP considered.
I had argued for many years that insects met a lot of the functional standards one could use to identify the presence of well-being, and that even after taking two-envelopes issues and nervous system scale into account expected welfare at stake for small wild animals looked much larger than for FAW.
I’m not planning on continuing a long thread here, I mostly wanted to help address the questions about my previous comment, so I’ll be moving on after this. But I will say two things regarding the above. First, this effect (computational scale) is smaller for chickens but progressively enormous for e.g. shrimp or lobster or flies. Second, this is a huge move and one really needs to wrestle with intertheoretic comparisons to justify it:
I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above.
Suppose we compared the mass of the human population of Earth with the mass of an individual human. We could compare them on 12 metrics, like per capita mass, per capita square root mass, per capita foot mass… and aggregate mass. If we use the equal-weighted geometric mean, we will conclude the individual has a mass within an order of magnitude of the total Earth population, instead of billions of times less.
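(Spelling out the analogy’s arithmetic with rough, made-up numbers: a population-to-individual mass ratio of about 8 billion on the aggregate-mass metric, and eleven per-capita-style metrics with ratios of about 1.)

```python
# One metric (aggregate mass) gives a ratio of ~8 billion; the other eleven give ~1.
ratios = [8e9] + [1.0] * 11

# Equal-weighted geometric mean over the 12 metrics nearly erases the billions-fold gap.
geo_mean = 1.0
for r in ratios:
    geo_mean *= r ** (1 / len(ratios))

print(geo_mean)  # ~6.7 -- within an order of magnitude of 1, not billions
```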
I’m not planning on continuing a long thread here, I mostly wanted to help address the questions about my previous comment, so I’ll be moving on after this.
Fair, as this is outside of the scope of the original post. I noticed you did not comment on RP’s neuron counts post. I think it would be valuable if you commented there about the concerns you expressed here, or did you already express them elsewhere in another post of RP’s moral weight project sequence?
First, this effect (computational scale) is smaller for chickens but progressively enormous for e.g. shrimp or lobster or flies.
I agree that is the case if one combines the 2 wildly different estimates for the welfare range (e.g. one based on the number of neurons, and another corresponding to RP’s median welfare ranges) with a weighted mean. However, as I commented above, using the geometric mean would cancel the effect.
Suppose we compared the mass of the human population of Earth with the mass of an individual human. We could compare them on 12 metrics, like per capita mass, per capita square root mass, per capita foot mass… and aggregate mass. If we use the equal-weighted geometric mean, we will conclude the individual has a mass within an order of magnitude of the total Earth population, instead of billions of times less.
Is this a good analogy? Maybe not:
Broadly speaking, giving the same weight to multiple estimates only makes sense if there is wide uncertainty about which one is more reliable. In the example above, it would make sense to give negligible weight to all metrics except the aggregate mass. In contrast, there is arguably wide uncertainty about which models best measure welfare ranges, and therefore distributing weights evenly is more appropriate.
One particular model on which we can put lots of weight is that mass is straightforwardly additive (at least at the macro scale). So we can say the mass of all humans equals the number of humans times the mass per human, and then just estimate this for a typical human. In contrast, it is arguably unclear whether one can obtain the welfare range of an animal by e.g. just adding up the welfare ranges of its individual neurons.
It seems to me that the naive way to handle the two envelopes problem (and I’ve never heard of a way better than the naive way) is to diversify your donations across two possible solutions to the two envelopes problem:
donate half your (neartermist) money on the assumption that you should use ratios to fixed human value
donate half your money on the assumption that you should fix the opposite way (e.g. fruit flies have fixed value)
Which would suggest donating half to animal welfare and probably half to global poverty. (If you let moral weights be linear with neuron count, I think that would still favor animal welfare, but you could get global poverty outweighing animal welfare if moral weight grows super-linearly with neuron count.)
Plausibly there are other neartermist worldviews you might include that don’t relate to the two envelopes problem, e.g. an “only give to the most robust interventions” worldview might favor GiveDirectly. So I could see an allocation of less than 50% to animal welfare.
There is no one opposite way; there are many other ways than to fix human value. You could fix the value in fruit flies, shrimps, chickens, elephants, C. elegans, some plant, some bacterium, rocks, your laptop, GPT-4, or an alien, etc.
I think a more principled approach would be to consider precise theories of how welfare scales, not necessarily fixing the value in any one moral patient, and then use some other approach to moral uncertainty for uncertainty between the theories. However, there is another argument for fixing human value across many such theories: we directly value our own experiences, and theorize about consciousness in relation to our own experiences, so we can fix the value in our own experiences and evaluate relative to them.