Thanks for reporting this. You found an issue introduced when we converted data from years to hours and overlooked the place in the code where that value was generated. It is fixed now. The intended range is half a minute to 37 minutes, with a mean of a little under 10 minutes. I’m not entirely sure where the exact numbers for that parameter come from, since Laura Duffy produced that part of the model and has moved on to another org, but I believe it is inspired by this report. As you point out, that is less than three hours of disabling-equivalent pain. I’ll have to dig deeper to figure out the rationale here.
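To illustrate the general shape of the slip (a hypothetical sketch with made-up names, not the model’s actual code): when a model is moved from one time unit to another, any site that still generates values in the old unit ends up off by the full conversion factor.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760

# Hypothetical illustration of a missed unit conversion (not the CCM's code).
def generate_duration_years() -> float:
    """One overlooked site still produced durations in years."""
    return 0.5

def downstream_calculation(duration_hours: float) -> float:
    """Downstream code assumes durations are already in hours."""
    return duration_hours  # placeholder for whatever the model does next

# The fix: convert at the point of generation, so consumers always see hours.
duration_hours = generate_duration_years() * HOURS_PER_YEAR
print(downstream_calculation(duration_hours))  # 4380.0
```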
Derek Shiller
After working on WIT, I’ve grown a lot more comfortable producing provisional answers to deep questions. In similar academic work, there are strong incentives to only try to answer questions in ways that are fully defensible: if there is some other way of going about it that gives a different result, you need to explain why your way is better. For giant nebulous questions, this means we will make very slow progress on finding a solution. Since these questions can be very important, it is better to come up with some imperfect answers rather than just working on simpler problems. WIT tries to tackle big important nebulous problems, and we have to sometimes make questionable assumptions to do so. The longer I’ve spent here, the more worthwhile our approach feels to me.
One of the big prioritization changes I’ve taken away from our tools is within longtermism. Playing around with our Cross-Cause Cost-Effectiveness Model, it was clear to me that so much of the expected value of the long-term future comes from the direction we expect it to take, rather than just whether it happens at all. If you can shift that direction a little bit, it makes a huge difference to overall value. I no longer think that extinction risk work is the best kind of intervention if you’re worried about the long-term future. I tend to think that AI (non-safety) policy work would come out as more impactful in expectation if we worked through all of the details.
Thanks for raising this point. We think that choosing the right decision theory for handling imprecise probabilities is a complex issue that has not been adequately resolved. We take the point that Mogensen’s conclusions have radical implications for the EA community at large, and we haven’t formulated a compelling story about where Mogensen goes wrong. However, we also believe that there are likely to be solutions that avoid those radical implications, and so we don’t need to bracket all cause prioritization work until we find them. Our tools may only be useful to those who think there is worthwhile work to be done on cause prioritization.
As a practical point, our Cross-Cause Cost-Effectiveness Model handles uncertainty with precise probabilities via Monte Carlo methods, randomly selecting individual values for each parameter in each outcome from a distribution. We noted our hesitance about enforcing a specific distribution over our range of radical uncertainty, but we stand behind this as a reasonable choice given our pragmatic aims. If the alternative is not to try to calculate relative expected values at all, we think that would be a loss, even if methodological doubts still attach to our own results.
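To make the mechanics concrete, here is a minimal sketch in Python of that kind of Monte Carlo procedure. The parameter names and distributions are invented for illustration; they are not the CCM’s actual parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000  # number of simulated "worlds"

# Invented parameters, each drawn from an assumed distribution.
prob_success = rng.beta(2, 8, n_samples)               # chance the intervention works
value_if_success = rng.lognormal(3.0, 1.0, n_samples)  # value conditional on success
cost = rng.normal(100.0, 10.0, n_samples)              # cost of one attempt

# Each sample fixes a precise value for every parameter,
# yielding one cost-effectiveness estimate per simulated world.
cost_effectiveness = prob_success * value_if_success / cost

print("expected value:", cost_effectiveness.mean())
print("5th-95th percentile:", np.percentile(cost_effectiveness, [5, 95]))
```

On this approach, relative expected values fall out as means across samples, while the spread of the samples preserves a picture of the underlying uncertainty.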
We appreciate your perspective; it gives us a chance to clarify our goals. The case you refer to was intended as an example of the ways in which normative uncertainty matters, and we did not mean for the views included there to accurately model real-world moral dilemmas or the span of reasonable responses to them.
However, you might also object that we don’t really make it possible to incorporate the intrinsic valuing of natural environments in our moral parliament tool. Some might see this as an oversight. Others might be concerned about other missing subjects of human concern: respect for God, proper veneration of our ancestors, aesthetic value, etc. We didn’t design the tool to encompass the full range of human values, but to reflect the major components of the values of the EA community (which is predominantly consequentialist and utilitarian). It is beyond the scope of this project to assess whether those values should be exhaustive. That said, we don’t think strict attachment to the values in the tool is necessary for deriving insights from it, and we think it models approaches to normative uncertainty well even if it doesn’t capture the full range of the subjects of human normative uncertainty.
AMA: Rethink Priorities’ Worldview Investigation Team
Rethink Priorities’ Moral Parliament Tool
Rethink Priorities’ Portfolio Builder Tool
For an intervention to be a longtermist priority, there needs to be some kind of concrete story for how it improves the long-term future.
I disagree with this. With existential risk from unaligned AI, I don’t think anyone has ever told a very clear story about how AI will actually get misaligned, get loose, and kill everyone. People have speculated about components of the story, but generally not in a super concrete way, and it isn’t clear how standard AI safety research would address a very specific disaster scenario. I don’t think this is a problem: we shouldn’t expect to know all the details of how things go wrong in advance, and it is worthwhile to do a lot of preparatory research that might be helpful so that we’re not fumbling through basic things during a critical period. I think the same applies to digital minds.
Your points here do not engage with the argument, made by @Zach Stein-Perlman early on in the week, that we can just punt solving AI welfare to the future (i.e., to the long reflection / to once we have aligned superintelligent advisors), and in the meantime continue focusing our resources on AI safety (i.e., on raising the probability that we make it to a long reflection).
I think this viewpoint is overly optimistic about the probability of lock-in and the relevance of superintelligent advisors. I discuss some of the issues around lock-in in a contribution to the debate week. In brief, I think it is possible that digital minds will be sufficiently integrated within the next few decades that they will hold power in social relationships that will be extremely difficult to disentangle. I also think that AGI may be useful in drawing inferences from our assumptions, but won’t be particularly helpful at setting the right assumptions.
I generally agree that the formal thesis for the debate week set a high bar that is difficult to defend, and I think this is a good statement of the case for that. Even if you think that AI welfare is important (which I do!), the field doesn’t have the existing talent pipelines or clear strategy to absorb $50 million in new funding each year. Putting that much in over the next few years could easily make things worse. It is also possible that AI welfare could attract non-EA money, and that it should aim for that rather than take money that would otherwise go to EA cause areas.
That said, there are other points that I disagree with:
It is not good enough to simply say that an issue might have a large scale impact and therefore think it should be an EA priority, it is not good enough to simply defer to Carl Shulman’s views if you yourself can’t argue why you think it’s “pretty likely… that there will be vast numbers of AIs that are smarter than us” and why those AIs deserve moral consideration.
I think that this is wrong. The fact that something might have a huge scale and we might be able to do something about it is enough for it to be taken seriously and provides prima facie evidence that it should be a priority. I think it is vastly preferable to preempt problems before they occur rather than try to fix them once they have. For one, AI welfare is a very complicated topic that will take years or decades to sort out. AI persons (or things that look like AI persons) could easily be here in the next decade. If we don’t start thinking about it soon, then we may be years behind when it happens.
AI people (of some form or other) are not exactly a purely hypothetical technology, and the epistemic case for them doesn’t seem fundamentally different from the case for thinking that AI safety will be an existential issue in the future, that the average intensively farmed animal leads a net-negative life, or that any given global health intervention won’t have significant unanticipated negative side effects. We’re dealing with deep uncertainties no matter what we do.
Additionally, it might be much harder to lobby for changes once things have gone wrong. I wish some groups had been actively lobbying against intensified animal agriculture in the 1930s (or the 1880s). It may not have been tractable, and the path may not have been clear, but it may have been possible to outlaw some terrible practices before they were adopted. We might have that opportunity now with AI welfare. Perhaps this means that we only need a small core group, but I do think some people should make it a priority.
I stick by my intuition, but it is really just an intuition about human behavior. Perhaps some people would be completely unbothered in that situation. Perhaps most would. (I actually find that itself worrisome in a different way, because it suggests that people may easily overlook AI wellbeing. Perhaps you have the right reasons for happily ignoring their anguished cries, but not everyone will.) This is an empirical question, really, and I don’t think we’ll know how people will react until it happens.
How could they not be conscious?
It is rare for theories of consciousness to make any demands on motivational structure.
Global workspace theory, for instance, says that consciousness depends on having a central repository by which different cognitive modules talk to each other. If the modules were to directly communicate point to point, there would be no conscious experiences (by that theory). I see no reason in that case why decision making would have to rely on different mechanisms.
Higher order theories suggest that consciousness depends on having representations of our own mental states. A creature could have all sorts of direct concerns that it never reflected on, and these could look a lot like ours.
IIT suggests that you could have a high-level duplicate of a conscious system that was unconscious due to the fine-grained details.
Etc.
The specific things you would need to change in the robots to render them not conscious depend on your theory, but I don’t think you need to go quite so far as to make them a lookup table or a transformer.
My impression was that you like theories that stress the mechanisms behind our judgments about the weirdness of consciousness as critical to conscious experiences. I could imagine a robot just like us but totally non-introspective, lacking phenomenal concepts, etc. Would you think such a thing was conscious? Could it not desire things in something like the way we do?

There’s another question about whether I’d actually dissect one, and maybe I still wouldn’t, but this could be for indirect or emotional reasons. It could still be very unpleasant or even traumatic for me to dissect something that cries out, against the desperate pleas of its mother. Or, it could be bad to become less sensitive to such responses, when such responses often are good indicators of a risk of morally significant harm. People who were confident nonhuman animals don’t matter in themselves sometimes condemned animal cruelty for similar reasons.
This supports my main argument. If you value conscious experience, these emotional reasons could be concerning for the long-term future. It seems like a slippery slope from being nice to them because we find it more pleasant, to thinking that they are moral patients, particularly if we frequently interact with them. It is possible that our generation will never stop caring about consciousness, but if we’re not careful, our children might.
Rethink Priorities’ Digital Consciousness Project Announcement
This case is interesting, but I think it touches on a slightly different issue. The asymbolic presumably doesn’t care about their pretend pain. There is a more complicated story about their actions that involves their commitment to the ruse. In the robot case, I assume we’re supposed to imagine that the robots care about each other to whatever extent unconscious things can. Their motivational structure is close to ours.
I think the case is less clear if we build up the extent to which the asymbolic child really wants the painkillers. If they constantly worry about not getting them, if they are willing to sacrifice lots of other things they care about to secure them (even though they know that it won’t help them avoid pain), and so on, then I’m less inclined to think the case is clear cut.
I agree! I’m used to armchair reflection, but this is really an empirical question. So much of the other discussion this week has focused on sentience. It would be good to get a sense of whether that is actually the crux for the public.
Thanks Richard!
If there’s nothing “all that important” about the identified pattern, whyever would we have identified it as the correct theory of consciousness to begin with?
This particular argument really speaks to the more radical physicalists. I don’t think you should be that moved by it. If I were in your shoes (rather than undecided), I think I’d be more worried that people would come to jettison their concern for consciousness for bad reasons.
One reason to reject this inference is if we accept the phenomenal intentionality thesis that consciousness is necessary for having genuinely representational states (including desires and preferences). I agree that consciousness need not be what’s represented as our goal-state; but it may still be a necessary background condition for us to have real goals at all (in contrast to the pseudo-intentionality of mere thermostats and the like).
One case I had in mind while writing this was the matter of unconscious desires in a conscious person. Suppose that we have some desires that shape our motivations but which we never think about. Maybe we have a desire to be near the ocean. We don’t feel any longing; we just find ourselves quickly accepting invitations to the beach. (We also aren’t glad to receive such invitations or any happier when at the beach.) Satisfying that desire seems to me not to count for much, in large part because it has no impact on our conscious states. Would you agree? If so, would you think the intentionality thesis can make sense of this difference? Do you want to withhold intentionality from purely unconscious states in a conscious mind? Or is there a different story you would tell?
I think there is a difference between what people would say about the case and what they would do if actually in it. The question of what people would say is interesting—I’m curious how your polling goes. But it is easier to support an intellectual stance when you’re not confronted by the ramifications of your choice. (Of course, I can also see it going the other way, if we think the ramifications of your choice would harm you instead of the robot.)
The Value of Consciousness as a Pivotal Question
I think it is valuable to have this stuff on record. If it isn’t recorded anywhere, then anyone who wants to reference this position in another academic work—even if it is the consensus within a field—is left presenting it in a way that makes it look like their personal opinion.
We have heard from some organizations that have taken a close look at the CCM, and it has spawned some back and forth about the takeaways. I don’t think I can disclose anything more specific at this point, though perhaps we will be able to in the future.