Effective altruists focussed on shaping the far future face a choice between different types of interventions. Of these, efforts to reduce the risk of human extinction have received the most attention so far. In this talk, Max Daniel makes the case that we may want to complement such work with interventions aimed at preventing very undesirable futures (“s-risks”), and that this provides a reason to focus on AI risk among the sources of existential risk identified so far.
Transcript: Why S-risks are the worst existential risks, and how to prevent them
I’m going to talk about risks of large-scale severe suffering in the far future, or S-risks. And to illustrate what S-risks are about, I’d like to start with a fictional story from the British TV series Black Mirror, which some of you may have seen. In this fictional scenario, it’s possible to upload human minds into virtual environments. In this way, sentient beings can effectively be stored and run on very small computing devices, such as the white, egg-shaped gadget you can see on the screen here. Behind the computing device, you can see Matt.
Matt’s job is to sell those virtual humans as virtual assistants. And because this isn’t a job description that’s particularly appealing to everyone, part of Matt’s job is to convince these human uploads to actually comply with the requests of their human owners. In this instance, the human upload Greta, whom you can see here, is unwilling to do this. She’s not thrilled with the prospect of serving for the rest of her life as a virtual assistant. In order to break her will, in order to make her comply, Matt increases the rate at which time passes for Greta. So, while Matt only needs to wait for a few seconds, during that time Greta effectively endures many months of solitary confinement.
So, I hope you agree that this would be an undesirable scenario. Now, fortunately, this particular scenario is of course quite unlikely to be realized. For any particular scenario we can imagine, it’s pretty unlikely that it will unfold in precisely this form. So that’s not the point here. However, I’ll argue that we do in fact face risks of a broad range of scenarios that are in some ways like this one, or even worse.
And I’ll call these risks S-risks. I’ll first explain what these S-risks are, contrasting them with the more familiar existential risks, or X-risks. Then, in the second part of the talk, I’ll talk a bit about why, as effective altruists, we may want to prevent those S-risks, and how we could do this. The way I’d like to introduce S-risks is as a subclass of the more familiar existential risks.
As you may recall, these have been defined by Nick Bostrom as risks where an adverse outcome would either completely annihilate Earth-originating intelligent life or at least permanently and drastically curtail its potential. Bostrom also suggested in one of his major publications on existential risk that one way to understand how these risks differ from other kinds of risks is to look at how bad this adverse outcome would be along two dimensions. And these dimensions are the scope and the severity of the adverse outcome that we’re worried about. I’ve reproduced one of Bostrom’s central figures here.
You can see a risk’s scope on the vertical axis. That is, here we ask: how many individuals would be negatively affected if the risk were realized? Is it just a small number of individuals? Is it everyone in a particular region, or even everyone alive on Earth? Or, in the worst case, everyone alive on Earth plus some or even all future generations? The second relevant dimension is severity. Here we ask: for each individual that would be affected, how bad would the outcome be? For instance, consider the risk of a single fatal car crash. If this happened, it would be pretty bad. You could die, so it would have pretty high severity, but it would only have personal scope, because in any single car crash only a small number of individuals are affected. However, there are other risks with even worse severity. For instance, consider factory farming. We commonly believe that the lives of, say, chickens in battery cages are so bad that it would be better not to bring these chickens into existence in the first place. That’s why we believe it’s a good thing that most of the food at this conference is vegan. Another way to look at this is that, I guess, some of you would think that the prospect of being tortured for the rest of your life would probably be even worse than a fatal car crash. So, there can be risks with even worse severity than terminal risks such as a fatal car crash. S-risks are now risks which, with respect to their severity, are about as bad as factory farming, in that they concern outcomes that would be even worse than non-existence, but which would also have much greater scope than a car crash or even factory farming, in that they could potentially affect a very large number of beings for the entire far future, across the whole universe.
So, this explains why in the title of the talk I claimed that S-risks are the worst existential risks. I said this because I just defined them to be risks of outcomes which have the worst possible severity and the worst possible scope. So, one way to understand this and how they differ from other kinds of existential risk is to zoom in on the top right corner of the figure I showed before. This is the corner that shows existential risks.
These are risks that would affect at least everyone alive on Earth plus all future generations. This is why Bostrom called them risks of pan-generational scope, and risks which would be at least what Bostrom called crushing, which we can roughly understand as removing everything that would be valuable for those individuals. One central example of such existential risks are risks of extinction, and these have already received a lot of attention in the EA community. They have pan-generational scope because they would affect everyone alive and would also remove the rest of the future, and they would be crushing because they would remove everything valuable. But S-risks are another type of risk that is also conceptually included in this concept of existential risk. They are risks of outcomes that would be even worse than extinction, because they would contain a lot of things we disvalue, such as intense involuntary suffering, and that would have even larger scope, because they would affect a significant part of the universe.
So, you could think of the Black Mirror story from the beginning and imagine that Greta endures her solitary confinement for the rest of her life, and that it’s not only one upload but a large population of such uploads across the universe. Or you could think of something like factory farming with a much, much larger scope, for some reason realized in many ways across the whole galaxy. I have now explained what S-risks are conceptually. They are risks of severe involuntary suffering on a cosmic scale, thereby exceeding the total amount of suffering we’ve seen on Earth so far. This makes them a subclass of existential risk, but a subclass distinct from the more well-known extinction risks. So far, I’ve just defined a conceptual term. I’ve called attention to some kind of possibility.
But what may be more relevant, as effective altruists, is whether reducing S-risks is something we can do, and if so, whether it’s something we should do. And let’s make sure that we understand this question correctly. All plausible ethical views agree that intense involuntary suffering is a bad thing, so I hope you can all agree that reducing S-risks is a good thing. But of course you’re here because you’re interested in effective altruism. That is, you don’t just want to know whether there is something good you can do; you’re interested in identifying the most good we can do. We realize that doing good has opportunity costs, and we really want to make sure to focus our time and our money where we can have the most impact. So, the question here really is, and the question I’d like to discuss is: can reducing S-risks meet this higher bar? For at least some of us, could it be the best use of our time or money to invest these in reducing S-risks, as opposed to doing other things that could be good in some sense? This of course is a very challenging question, and I won’t be able to conclusively and comprehensively answer it today. To illustrate just how complex this question is, and also to make really clear what kind of argument I’m not making here, I’ll first introduce a flawed argument, an argument that doesn’t work, for focusing on reducing S-risks. This argument roughly goes as follows.
First premise: the best thing to do is to prevent the worst risks. Second premise: S-risks are by definition the worst risks. So, as the conclusion, you may think the best thing to do is to prevent S-risks. Now, with respect to premise one, let’s get one potential source of misunderstanding out of the way. One way to understand this first premise is that it could be a rock-bottom feature of your ethical worldview.
That is, you could think that, whatever you expect to happen in the future, you have some specific additional ethical reasons to focus on preventing the worst-case outcomes: some kind of maximin principle, or perhaps prioritarianism applied to the far future. This, however, is not the sense I’m going to talk about today. If your ethical view contains some such principles, I think they give you additional reasons for focusing on S-risks, but this is not what I’m going to talk about. What I’m going to talk about is that there are more criteria relevant for identifying the ethically optimal action than the two dimensions of risk we have looked at so far.
So far, we’ve only looked at how bad the outcome would be, in terms of its severity and its scope, if a risk were realized. In this sense, S-risks are the worst risks, but premise one is still not clearly true. Because when deciding what the best thing to do is, there are more criteria that are relevant, and many of you will be well familiar with those criteria because they are rightly getting a lot of attention in the EA community. In order to see if reducing S-risks is the best thing we can do, we really need to look at how likely it is that these S-risks would be realized; how easy reducing them would be, in other words, how tractable it is; and how neglected this endeavor is: are there already a lot of people or organizations doing it, and how much attention is it already getting? These criteria clearly are relevant. Even if you are a prioritarian or the like and think you have a lot of reasons to focus on the worst outcomes, if, for instance, their probability were zero, or there were absolutely nothing we could do to reduce them, it would make no sense to try. So, we need to say something about the probability, the tractability and the neglectedness of these S-risks, and I’ll offer some initial thoughts on each in the rest of the talk. What about the probability of these S-risks? What I’ll argue for here is that S-risks are at least not much more unlikely than extinction risks from superintelligent AI, which are a class of risks that at least some parts of the community take seriously and think we should do something about.
And I’ll explain why I think this is true and will address two kinds of objections you may have, that is, reasons to think that these S-risks are in fact too unlikely to focus on. The first objection could be that these kinds of S-risks are just too absurd. We can’t even send humans to Mars yet, so why should we worry about suffering on cosmic scales, you could think.
And in fact, when I first encountered related ideas, I had a similar immediate intuitive reaction: that this is a bit speculative, maybe this is not something I should focus on. But I think we should really be careful to examine such intuitive reactions, because, as many of you I guess are well aware, there is a large body of psychological research in the heuristics and biases approach that suggests that intuitive assessments of probability by humans are often driven by how easy it is for us to recall a prototypical example of the kind of scenario we are considering. And for things that have never happened, for which there is no historical precedent, this leads us to intuitively and systematically underestimate their probability, a phenomenon known as the absurdity heuristic. So, I think we shouldn’t go with this intuitive reaction, but should rather really examine what we can say about the probability of these S-risks.
And if we look at our best scientific theories and what the experts are saying about how the future could unfold, I think we can identify two not-too-implausible technological developments that may plausibly lead to the realization of S-risks. This is not to say that these are the only possibilities; there may be unknown unknowns, things we can’t foresee yet, that could also lead to such S-risks, but there are some known pathways that could get us into S-risk territory. These two are artificial sentience and superintelligent AI. Artificial sentience simply refers to the idea that the capacity to have subjective experience, and in particular the capacity to suffer, is in fact not limited in principle to biological animals, but that there could be novel kinds of beings, perhaps computer programs stored on silicon-based hardware, about whose suffering we would also have reasons to care. And while this isn’t completely settled, in fact few contemporary views in the philosophy of mind would say that artificial sentience is impossible in principle. So, it seems to be a conceptual possibility we should be concerned about. Now, how likely is it that this will ever be realized? This may be less clear, but here as well we can identify one technological pathway that may lead to artificial sentience, and this is the idea of whole brain emulation: basically, understanding the human brain in sufficient detail that we could build a functionally equivalent computer simulation of it. It’s still not completely certain that we will be able to develop this technology, but researchers have looked at this and have outlined a quite detailed roadmap for it. They’ve identified concrete milestones and remaining uncertainties, and have concluded that this definitely seems to be something we should take into account when thinking about the future. So, I’d argue there is a not-too-implausible technological possibility that we will get to artificial sentience.
I won’t say as much about the second development, superintelligent AI, because this is already getting a lot of attention in the EA community. If you aren’t familiar with worries related to superintelligent AI, I recommend Nick Bostrom’s excellent book Superintelligence. I’ll just add that superintelligent AI presumably could also unlock many more of the technological capabilities that we would need to get into S-risk territory, for instance the capacity to colonize space and spread sentient beings out into larger parts of the universe. This could conceivably be realized by superintelligent AI. I’d also like to add that some scenarios in which the interaction between superintelligent AI and artificial sentience could lead to S-risks have been discussed by Bostrom, in Superintelligence and other places, under the term mind crime. So, this is something you could search for if you’re interested in related ideas.
So, in fact, if we look at what we can say about the future, I think it would be a mistake to say that S-risks are so unlikely that we shouldn’t care about them. But maybe you now have a different objection. Maybe you’re convinced that, in terms of technological capabilities, we can’t be sure that these S-risks are too unlikely, but you may think that vast amounts of suffering seem to be a pretty specific outcome: even if we have much greater technological capabilities, it seems unlikely that such an especially bad outcome will be realized. You could think that, after all, this would require some kind of evil agent, some kind of evil intent, that actively tries to make sure that we get these vast amounts of suffering. And I agree that this seems to be pretty unlikely, but here again, after reflecting on this a little bit, I think we can see that this is only one, and perhaps the most implausible, route to get into S-risk territory. There also are two other routes, I’d like to argue.
The first of these is that S-risks could arise by accident. One class of scenarios in which this could happen is the following. Imagine that the first artificially sentient beings we create aren’t as highly developed as complete human minds, but are perhaps more similar to non-human animals, in that we may create artificially sentient beings with the capacity to suffer but with a limited capability to communicate with us and to signal that they are suffering. In an extreme case, we may create beings that are sentient, that can suffer, but whose suffering we overlook because there is no possibility of easy communication.
A second scenario where S-risks could be realized without evil intent is the toy example of a paperclip maximizer, which serves to illustrate what would happen if we create a very powerful superintelligent AI that pursues some unrelated goal: a goal that’s neither closely aligned with our values nor actively evil. As Nick Bostrom and many others have argued, such a paperclip maximizer could conceivably lead to human extinction, for instance because it would convert the whole Earth and all the matter around here into paperclips, since it just wants to maximize the number of paperclips and has no regard for human survival. But it’s only a small further step to worry: what if such a paperclip maximizer runs, for instance, sentient simulations, say for scientific purposes, to better understand how to maximize paperclip production? Or maybe, similar to the way our suffering serves some kind of evolutionary function, a paperclip maximizer would create some kind of artificially sentient sub-programs or workers whose suffering would be instrumentally useful for maximizing the production of paperclips. So, we only need to add very few additional assumptions to see that scenarios which are already getting a lot of attention could lead not only to human extinction but in fact to outcomes that would be even worse.

Finally, to understand the significance of the third route, that S-risks could be realized as part of a conflict, note that it’s often the case that if we have a large number of agents competing for a shared pool of resources, this can incentivize negative-sum dynamics that lead to very bad outcomes, even if none of the agents involved actively values those bad outcomes; they are just resorting to them in order to out-compete the other agents. For instance, look at most wars: the countries waging them rarely intrinsically value the suffering and the bloodshed implicated in them, but sometimes wars still happen to further the strategic interests of the countries involved.
So I think, if we critically examine the situation we are in, we should conclude that if we take seriously a lot of the considerations that are already being widely discussed in the community, such as risks from superintelligent AI, there are only a few additional assumptions we need to justify worries about S-risks. It’s not as if we need to invent some completely made-up technologies, or need to assume extremely implausible or rare motivations such as sadism or hatred, to substantiate worries about S-risks. This is why I’ve said that I think S-risks are at least not much more unlikely than, say, extinction risks from superintelligent AI. Now, of course, the probability of S-risks isn’t the only criterion we need to address; as I said, we also need to ask how easy it is to reduce those S-risks. And in fact I think this is a pretty challenging task. We haven’t found any kind of silver bullet here yet, but I’d also like to argue that reducing S-risks is at least minimally tractable even today, and one reason for this is that we are arguably already reducing S-risks. As I just said, some scenarios in which S-risks could be realized are ones where superintelligent AI goes wrong in some way.
This is why some work in technical AI safety, as well as AI policy, probably already effectively reduces S-risks. To give you one example: I said that we might be worried about S-risks arising because of the strategic behavior of, say, AI agents as part of a conflict. Some work in AI policy that reduces the likelihood of such multipolar AI scenarios, and makes unipolar AI scenarios with less competition more likely, could in particular have the effect of reducing S-risks. Similarly with some work in technical AI safety. That being said, it seems to me that a lot of the interventions currently being undertaken reduce S-risks by accident, in a sense: they aren’t specifically tailored to reducing S-risks, and there may well be particular sub-problems within technical AI safety that would be particularly effective at reducing S-risks specifically and which aren’t getting a lot of attention already. For instance, to give you one toy example that’s probably hard to realize in precisely this form, but that illustrates what might be possible: consider the goal of making sure that, conditional on our efforts at solving the control problem failing and the AI being uncontrolled, the AI at least doesn’t create, say, additional sentient simulations or artificially sentient sub-programs. If we could solve this problem through work in technical AI safety, we would arguably reduce S-risks specifically.
Of course, there also are broader interventions that don’t directly aim to influence the kinds of levers that directly affect the far future, but that would have a more indirect effect on reducing S-risks. For instance, we could think that strengthening international cooperation will enable us, at some point, to prevent AI arms races that could again lead to the negative-sum dynamics that could lead us into S-risk territory. Similarly, because artificial sentience is such a significant worry when thinking about S-risks, we could think that expanding the moral circle, making it more likely that human decision-makers in the future will care about artificially sentient beings, would have a positive effect on reducing S-risks. All that being said, I think it’s fair to say that we currently don’t understand very well how best to reduce S-risks. One thing we could do, if we suspect that there are some low-hanging fruits to reap there, is to go meta and do research about how best to reduce those S-risks, and in fact this is a large part of what we are doing at the Foundational Research Institute.

Now, there’s also another aspect of tractability I’d like to talk about. This is not the question of how intrinsically easy it is to reduce S-risks, but the question of whether we could raise the required support. For instance, can we get sufficient funding to get work on reducing S-risks off the ground? One worry we may have here is that all this talk about suffering on a cosmic scale and so on will seem too unlikely to many people; in other words, that S-risks are just too weird a concern for us to be able to raise significant support and funding to reduce them.
And I think this is to some extent a legitimate worry, but I also don’t think that we should be too pessimistic, and I think the history of the AI safety field substantiates this assessment. If you think back even ten years, you will find that back then, worries about extinction risk from superintelligent AI were ridiculed, dismissed, or misrepresented and misunderstood as, for instance, being about the Terminator or something like that. Today we have Bill Gates blurbing a book that talks openly and directly about these risks from superintelligent AI, and also about related concepts such as mind crime. So I would argue that the recent history of the AI safety field provides some reason for hope that we are able to push even seemingly weird cause areas sufficiently far into the mainstream, into the window of acceptable discourse, that we can raise significant support for them.
Last but not least, what about the neglectedness of S-risks? As I said, some work that’s already underway in the X-risk area arguably reduces S-risks, so reducing S-risks is not totally neglected, but I think it’s fair to say that they get much less attention than, say, extinction risks. In fact, I’ve sometimes seen people in the community either explicitly or implicitly equate existential risk and extinction risk, which conceptually is clearly not correct. And while some existing interventions may also be effective at reducing S-risks, there are few people specifically trying to identify the interventions that are most effective at reducing S-risks in particular. I think the Foundational Research Institute is the only EA organization which, at an organizational level, has the mission of focusing on reducing S-risks.

So, to summarize: I haven’t conclusively answered the question of for whom exactly reducing S-risks is the best thing to do. I think this depends both on your ethical view and on some empirical questions, such as the probability, the tractability and the neglectedness of S-risks. But I have argued that S-risks are not much more unlikely than, say, extinction risks from superintelligent AI, and so they warrant at least some attention. And I’ve argued that the most plausible known paths that could lead us into S-risk territory, aside from unknown unknowns, are AI scenarios that involve the creation of large numbers of artificially sentient beings. This is why I think that, among the currently known sources of existential risk, the AI risk cause area is unique in also being very relevant for reducing S-risks. If we don’t get AI right, there seems to be a significant chance that we get into S-risk territory, whereas in other areas, say an asteroid hitting Earth or a deadly pandemic wiping out human life, it seems much less likely that this could get us into scenarios that would be much, much worse than extinction because they in addition contain a lot of suffering. In this sense, if you haven’t considered S-risks before, I think this is an update towards caring more about the AI risk cause area as opposed to other X-risk cause areas. In this sense, some but not all of the current work in the X-risk area is already effective at reducing S-risks, but there seems to be a gap of people and research specifically optimizing for reducing S-risks and trying to find the interventions that are most effective for this particular goal. I’d argue, in sum, that the Foundational Research Institute, in having this unique focus, occupies an important niche, and I would very much like to see more people join us in that niche: people from, say, other organizations also doing some kind of research that’s hopefully effective at reducing S-risks. This all being said, I hope to have raised some awareness of the worrying prospect of S-risks. I don’t think I have convinced all of you that reducing S-risks is the best use of your resources.
I don’t think I could expect this, both because our rock-bottom ethical views differ to some extent, and also because the empirical questions involved are just extremely complex, and it seems very hard to reach agreement on them. So, I think that realistically, those of us who are interested in shaping the far future, who are convinced that this is the most important thing to do, will be faced with a situation where there are people with different priorities in the community, and we need to sort out how to manage this situation. This is why I’d like to end this talk with a vision for this far-future-shaping community. Shaping the far future could, as a metaphor, be seen as being on a long journey. What I hope I have made clear is that it’s a misrepresentation to frame this journey as involving a binary choice between extinction and utopia. In another sense, however, I’d argue that this metaphor is apt. We do face a long journey, but it’s a journey through hard-to-traverse territory, and on the horizon there is a continuum ranging from a very bad thunderstorm to the most beautiful summer day. Interest in shaping the far future in a sense determines who is with us in the vehicle, but it doesn’t necessarily comprehensively answer the question of what, more precisely, to do with the steering wheel.
Some of us are more worried about not getting into the thunderstorm; others are more motivated by the existential hope of maybe arriving at that beautiful sunshine. And it seems hard to reach agreement on what, more precisely, to do, and part of the reason is that it’s very hard to keep track of the complicated network of roads far ahead and to see which direction of steering will lead to precisely which outcome. This is very hard; by contrast, we have an easy time seeing who else is with us in the vehicle. So the concluding thought I’d like to offer is that maybe among the most important things we can do is to make sure to compare our maps among the people in the vehicle, and to find some way of handling the remaining disagreements without inadvertently derailing the vehicle and getting to an outcome that’s worse for all. Thank you.
I’ll start off with a question that a few people were asking, about something that you said isn’t necessary for being concerned with S-risks but that would help shed a little bit of clarity: besides AI, besides whole brain emulation and running uploaded brains, are there any other forms of S-risk that you can try to visualize? In particular, people were trying to figure out ways in which you can work on the problem if you don’t have a concrete sense of the way in which it might manifest.

So, I do in fact think that the most plausible scenarios we can foresee today involve artificial sentience, partly because, as many people have discussed, artificial sentience would come with novel challenges; for instance, it would presumably be very easy to spawn a large number of artificially sentient beings. There are a lot of efficiency advantages of silicon-based substrates as compared to biological substrates. We can also observe that in many other areas the worst outcomes contain most of the expected value, as with the predominance of fat-tailed distributions, for instance with respect to the casualties of wars and diseases and so on. So, it seems somewhat plausible to me that if we are concerned about reducing as much suffering as possible in expectation, we should focus on these very bad outcomes, and that most of these, for various reasons, involve artificial sentience. That being said, I think there are some scenarios, especially future scenarios where we don’t have this archetypical intelligence explosion and hard takeoff but are faced with a messier and more complex future, where there are maybe a lot of factions controlling AI and using it for various purposes, in which we could face risks that maybe don’t have as high a scope as the worst scenarios involving artificial sentience, but would be more akin to factory farming: some kind of novel technology that would be misused in some way, maybe just because people don’t sufficiently care about the consequences and pursue some kind of, say, economic goals, and inadvertently create large amounts of suffering, similar to the way we can see this happening today in, for example, the animal industry.
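To make the point about fat-tailed distributions concrete, here is a minimal sketch, assuming a Pareto distribution with tail index 1.1 chosen purely for illustration: in such a heavy-tailed distribution, a small fraction of the worst outcomes accounts for a large share of the total.

```python
# Minimal illustration: sample "event sizes" from a heavy-tailed (Pareto)
# distribution and check what share of the total the worst 1% of events holds.
# The distribution and all parameters are assumptions chosen for illustration.
import random

random.seed(0)
ALPHA = 1.1        # tail index; the tail gets heavier as ALPHA approaches 1
N = 100_000

samples = sorted((random.paretovariate(ALPHA) for _ in range(N)), reverse=True)
top_share = sum(samples[: N // 100]) / sum(samples)
print(f"The worst 1% of events account for {top_share:.0%} of the total")
```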
You said that the debate is still somewhat open on whether you could actually extend your moral concern to something that’s on a silicon substrate, that isn’t flesh and bone in the way that we are. Can you make the case for why we might in fact care about something that is an uploaded brain and isn’t a brain in the way that we generally think of it?

So, one suggestive thought experiment that has been discussed in the philosophy of mind is to imagine that you replace your brain, not all at once, but step by step, with silicon-based hardware. You start with replacing just one neuron with some kind of chip that serves the same function. It seems intuitively clear that this doesn’t make you less sentient, or mean that we should care less about you. And now you can imagine, step by step, replacing your brain one neuron at a time with, in some sense, a computer. It seems that you would have a hard time pinpointing any particular point in this transition where you say: oh well, now the situation flips and we should stop caring; after all, the same information processing is still going on in that brain. But there is a large body of literature in the philosophy of mind discussing this question.

And assuming that these brains do in fact have the capacity to suffer, what reason would we have to think that it would be advantageous, say for a superintelligence, to emulate lots of brains in a way in which they suffer, rather than just have them exist without any sort of positive or negative feeling?
So, one reason we may be worried is that if we look at the current successes in the AI field, we see that they are often driven by machine learning techniques. That is, techniques where we don’t program the knowledge and the capabilities we think the AI system should have directly into the system, but rather set up some kind of algorithm that can be trained and that can learn via trial and error, receiving some information about how good or bad it’s doing, and thereby increasing its capabilities. Now, it seems very unlikely that the current machine learning techniques that involve giving some kind of reward signal to algorithms should be concerning to us to a large extent. I don’t want to claim that current, say, reinforcement learning algorithms are suffering to a large extent. But we may be worried that similar architectures, where the capabilities of artificially sentient beings arise from them being trained and receiving some kind of reward signal, are a feature of AI systems that will persist even at a point when the sentience of these algorithms is realized to a larger extent. In some way, this is similar to the way that, as I mentioned, our suffering serves some kind of evolutionary function that helps us navigate the world; in fact, people who don’t feel pain have a great deal of difficulty for this reason, because they don’t intuitively avoid damaging outcomes. This is certainly a longer discussion, but I hope that gives at least a brief answer.
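To make the trial-and-error picture described here concrete, below is a minimal sketch of a reward-driven learner (a simple epsilon-greedy bandit; the environment, reward values and parameters are assumptions chosen purely for illustration, and nothing here is a claim about sentience). The point it illustrates is only that the system’s behaviour is shaped entirely by a scalar reward signal rather than by hand-written rules.

```python
# A minimal, purely illustrative reward-driven learner (epsilon-greedy bandit).
# The environment and all parameters are assumptions chosen for illustration.
import random

random.seed(0)
N_ACTIONS = 3
EPSILON = 0.1                    # how often the learner explores at random
values = [0.0] * N_ACTIONS       # learned estimates of each action's reward
counts = [0] * N_ACTIONS

def reward(action: int) -> float:
    """Hypothetical environment: one action is rewarded, the others penalised."""
    return 1.0 if action == 2 else -1.0

for _ in range(1000):
    # Trial and error: mostly pick the best-known action, sometimes explore.
    if random.random() < EPSILON:
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: values[a])
    r = reward(action)
    counts[action] += 1
    # Nudge the estimate for this action toward the observed reward signal.
    values[action] += (r - values[action]) / counts[action]

print("Learned action values:", [round(v, 2) for v in values])
```

The only thing shaping the learned values here is the sign and size of the reward signal, which is the feature of such training setups that the answer above points to.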
A couple of people also wanted to know: you have a particular suffering focus in the case of S-risks, but some people wonder whether an agent might actually prefer to exist even if their experiences are pretty negative; it’s not clear whether they would prefer death or suffering. Is this a choice that you would be making on behalf of the agents that you’re considering in your moral realm when you’re trying to mitigate an S-risk? Is this a necessary precondition for caring about S-risks?

So, I think that whatever your rock-bottom ethical views, there nearly always are prudential reasons for considering the preferences of other agents. If I were faced with a situation where I think there is some kind of being whose experiences are so negative that, in a consequentialist sense, it would be better that this being doesn’t exist, but this being has, for whatever reason, a strong preference to exist, and then argues with me about whether it should continue or not, and so on and so forth, I think there often can be prudential reasons to take these preferences into account. I think there will actually be some kind of convergence between different ethical views on the question of how to take such hypothetical preferences into account.
That being said, I think it’s fairly implausible to claim that no imaginable amount of suffering would be intrinsically worse than non-existence. One intuition pump for this could be: if you face the choice between one hour of sleep and one hour of torture, which do you prefer? It seems fairly clear, I would guess, to most of us that one hour of sleep, having no experience at all, is the better choice in this sense.

You said that hopefully we’ll come to some sort of convergence on what the true moral philosophy, insofar as there is one, is. But there might also be reason to think that we won’t do this on the timescales of the development of superintelligent AI, or of whole brain emulations that we can run on many computers. What do we do in that case, where we haven’t solved moral philosophy in time?

So that, I think, is a very important question, because to me it seems fairly likely that there won’t be such convergence, at least not in much detail.
There’s a written version of this talk: “S-risks: Why they are the worst existential risks, and how to prevent them (EAG Boston 2017)”, Center on Long-Term Risk (longtermrisk.org).