Do you think it would be better if no one who worked at OpenAI / Anthropic / Deepmind worked on safety? If those organizations devoted less of their budget to safety? (Or do you think we should want them to hire for those roles, but hire less capable or less worried people, so individuals should avoid potentially increasing the pool of talent from which they can hire?)
Derek Shiller
EA should be willing to explore all potentially fruitful avenues of mission fulfillment without regard to taboo.
In general, where it doesn’t directly relate to cause areas of principal concern to effective altruists, I think EAs should strive to respect others’ sacred cows as much as possible. Effective Altruism is a philosophy promoting practical action. It would be harder to find allies who will help us achieve our goals if we are careless about the things other people care a lot about.
I think you’re right that we don’t provide a really detailed model of the far future and we underestimate* expected value as a result. It’s hard to know how to model the hypothetical technologies we’ve thought of, let alone the technologies that we haven’t. These are the kinds of things you have to take into consideration when applying the model, and we don’t endorse the outputs as definitive, even once you’ve tailored the parameters to your own views.
That said, I do think the model has greater flexibility than you suggest. Some of these options are hidden by default, because they aren’t relevant given the cutoff year of 3023 we default to. You can see them by extending that year far out. Our model uses parameters for expansion speed and population per star. It also lets you set the density of stars. If you think that we’ll expand at near the speed of light and colonize every brown dwarf, you can set that. If you think each star will host a quintillion minds, you can set that too. We don’t try to handle relative welfare levels for future beings; we just assume their welfare is the same as ours. This is probably pessimistic. We considered changing this, but it actually doesn’t make a huge difference to the overall shape of the results, so we didn’t consider it a priority. The same goes for clock speed differences. If you want to represent this within the model as written, you can just inflate the population per star. What the model can’t do is capture non-cubic (and non-static) population growth rates. It also breaks down in the real far future, and we don’t model the end of the universe.
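To make those knobs concrete, here is a rough sketch of how expansion speed, star density, and population per star could combine into a total. This is not the model’s actual code; the parameter names, the star-density figure, and the per-star population are all illustrative.

```python
import math

def total_value(
    years: int = 1000,                     # years of expansion to tally
    expansion_speed: float = 0.8,          # settlement frontier, as a fraction of c
    star_density: float = 0.004,           # stars per cubic light-year (rough local figure)
    population_per_star: float = 1e9,      # minds supported per star
    welfare_per_person_year: float = 1.0,  # assumed equal to present welfare
) -> float:
    """Sum value over a sphere of settlement growing at a constant speed."""
    value = 0.0
    for t in range(1, years + 1):
        radius = expansion_speed * t  # light-years reached by year t
        stars = (4.0 / 3.0) * math.pi * radius**3 * star_density
        value += stars * population_per_star * welfare_per_person_year
    return value
```

Doubling the expansion speed multiplies the population reached in any given year by eight; that cubic scaling is exactly what the model can capture, and non-cubic growth curves would need a different functional form.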
Perhaps you object to parameter settings we chose as defaults. Whatever defaults we picked would be controversial. In response, let me just stress that they’re not intended as our answers to these questions. They are just a flexible starting point for people to explore.
* My guess is that the EV of surviving to the far future is infinite, if it isn’t undefined.
A couple of thoughts:
-
This argument doesn’t seem specific to longtermism. You could make the same case for short-term animal welfare. If you’ll be slightly more effective at passing sweeping changes to mitigate the harms of factory farming if you eat a chicken sandwich every day, the expectation of doing so is highly net positive even if you only care about chickens in the near future.
-
This argument doesn’t seem specific to veganism. You could make the same case for being a jerk in all manner of ways. If keying strangers’ cars helped you relax and get insight into the alignment problem, then the same reasoning might suggest you should do it.
This isn’t to say the argument is wrong, but I find the implications very distasteful.
-
I think it is valuable to have this stuff on record. If it isn’t recorded anywhere, then anyone who wants to reference this position in another academic work—even if it is the consensus within a field—is left presenting it in a way that makes it look like their personal opinion.
It seems like an SBF-type-figure could justify any action if the lives of trillions of future people are in the balance.
This doesn’t seem specific to utilitarianism. I think most ethical views would suggest that many radical actions would be acceptable if billions of lives hung in the balance. The ethical views that wouldn’t allow such radical actions would have their own crazy implications. Utilitarianism does make it easier to justify such actions, but with numbers so large I don’t think it generally makes a difference.
Big fan of your sequence!
I’m curious how you think about bounded utility functions. It’s not something I’ve thought about much. The following sort of case seems problematic.
Walking home one night from a lecture on astrophysics where you learned about the latest research establishing the massive size of the universe, you come across a child drowning in a pond. The kid is kicking and screaming, trying to stay above the water. You can see the terror in his eyes and you know that it’s going to get painful when the water starts filling his lungs. You see his mother, off in the distance, screaming and running. Something just tells you she’ll never get over this. It will wreck her marriage and her career. There’s a life preserver in easy reach. You could save the child without much fuss. But you recall from your lecture the oodles and oodles of people living on other planets and figure that we must be very near the bound of total value for the universe, so the kid’s death can’t be of more than the remotest significance. And there’s a real small chance that solipsism is true, in which case your whims matter much more (we’re not near the bounds) and satisfying them will make a much bigger difference to total value. The altruistic thing to do is to not make the effort, which could be mildly unpleasant, even though it very likely means the kid will die an agonizing death and his mother will mourn for decades.
That seems really wrong. Much more so than thinking that fanaticism is unreasonable.
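To make the worry concrete, here is a toy calculation. The tanh-shaped bound, the scale constants, and the credence in solipsism are all made up for illustration; the point is only the structure of the argument.

```python
import math

BOUND = 1.0  # the utility ceiling
SCALE = 1e6  # how much value it takes to approach the ceiling (arbitrary)

def utility(total_value: float) -> float:
    """A bounded utility function: increases in value matter less near the ceiling."""
    return BOUND * math.tanh(total_value / SCALE)

# If the universe is as big as the astrophysicists say, we're pressed against the bound.
cosmic_value = 1e12
gain_from_saving_child = utility(cosmic_value + 100) - utility(cosmic_value)

# Under solipsism, only your own experiences exist, so we're far from the bound.
solipsist_value = 10.0
gain_from_staying_dry = utility(solipsist_value + 1) - utility(solipsist_value)

# Even a minuscule credence in solipsism can dominate the expected utility.
p_solipsism = 1e-6
ev_of_saving = (1 - p_solipsism) * gain_from_saving_child
ev_of_whim = p_solipsism * gain_from_staying_dry
```

With these (arbitrary) numbers, `ev_of_whim` exceeds `ev_of_saving`: the bounded view recommends keeping dry.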
The problem with considering optics is that it’s chaotic.
The world is chaotic, and everything EAs try to do has largely unpredictable long-term effects because of complex dynamic interactions. We should try to think through the contingencies and make the best guess we can, but completely ignoring chaotic considerations just seems impossible.
It’s a better heuristic to focus on things which are actually good for the world, consistent with your values.
This sounds good in principle, but there are a ton of things that might conceivably be good but for PR reasons, where the PR reasons are decisive. E.g. should EAs engage in personal harassment campaigns against productive ML researchers in order to slow AI capabilities research? Maybe that would be good if it weren’t terrible PR, but I think we very obviously should not do it because it would be terrible PR.
There is some nuance to the case that seems to get overlooked in the poll. I feel completely free to express opinions in a personal capacity that might be at odds with my employer, but I also feel that there are some things it would be inappropriate to say while carrying out my job without running it by them first. It seems like you’re interested in the latter feeling, but the poll is naturally interpreted as addressing the former.
Toby Ord argues that this is incoherent because there are no natural units in which to measure happiness and suffering, and therefore it’s unclear what it even means to put them on the same scale.
One problem might be that there are no natural units on which to measure happiness and suffering. Another is that there are too many. If there are a hundred thousand different ways to put happiness and suffering on the same scale and they all differ in the exchange rate they imply, then it seems you’ve got the same problem. Your example of comparisons in terms of elementary particles feels somewhat arbitrary, which makes me think this may be an issue.
I think the greater potential concern is false-positives on consciousness, not false negatives
This is definitely a serious worry, but it seems much less likely to me.
One way this could happen is if we build large numbers of general purpose AI systems that we don’t realize are conscious and/or can suffer. However, I think that suffering is a pretty specialized cognitive state that was designed by natural selection for a role specific to our cognitive limitations, and not one we are likely to encounter by accident while building artificial systems. (It seems more likely to me that digital minds won’t suffer, but will have states that are morally relevant that we don’t realize are morally relevant because we’re so focused on suffering.)
Another way this could happen is if we artificially simulate large numbers of biological minds in detail. However, it seems very unlikely to me that we will ever run those simulations, and very unlikely that we’d miss the potential for accidental suffering if we do. At least in the short term, I expect most plausible digital minds will be intentionally designed to be conscious, which I think makes the risks of mistakenly believing they’re conscious more of a worry.
That said, I’m wary of trying to adjudicate which is the more concerning risk for topics that are still so speculative.
proposed “p-risk” after “p-zombies”
I kinda like “z-risk”, for similar reasons.
I believe Marcus and Peter will release something before long discussing how they actually think about prioritization decisions.
Generally the claims here fall prey to the fallacy of unevenly applying the possibility of large consequences to some acts where you highlight them and not to others, such that you wind up neglecting more likely paths to large consequences.
Could you be more specific about the claims that I make that involve this fallacy? This sounds to me like a general critique of Pascal’s mugging, which I don’t think fits the case that I’ve made. For instance, I suggested that the simple MWI has some probability p and would mean, if true, that it is trivially possible to generate 2v in value, where v is all the value currently in the world. The expected value of doing things that might cause 1000 successive branchings is then ~p·(2^1000)v. Do you think that there is a higher probability way to generate a similar amount of value?
then making a much more advanced and stable civilization is far more promising for realizing things related to that.
I suppose your point might be something like, absurdist research is promising, and that is precisely why we need humanity to spread throughout the stars. Just think of how many zany long-shot possibilities we’ll get to pursue! If so, that sounds fair to me. Maybe that is what the fanatic would want. It’s not obvious that we should focus on saving humanity for now and leave the absurd research for later. Asymmetries in time might make us much more powerful now than later, but I can see why you might think that. I find it a rather odd motivation though.
My impression is that EAs also often talk about ethical consequentialism when they mean something somewhat different. Ethical consequentialism is traditionally a theory about what distinguishes the right ways to act from the wrong ways to act. In certain circumstances, it suggests that lying, cheating, rape, torture, and murder can be not only permissible, but downright obligatory. A lot of people find these implications implausible.
Ethical consequentialists often think what they do because they really care about value in aggregate. They don’t just want to be happy and well off themselves, or have a happy and well off family. They want everyone to be happy and well off. They want value to be maximized, not distributed in their favor.
A moral theory that gets everyone to act in ways that maximize value will make the world a better place. However, it is consistent to think that consequentialism is wrong about moral action and to nonetheless care primarily about value in aggregate. I get the impression that EAs are more attached to the latter than the former. We generally care that things be as good as they can be. We have less of a stake in whether torture is a-ok if the expected utility is positive. The EA attitude is more of a ‘hey, let’s do some good!’ and less of a ‘you’re not allowed to fail to maximize value!’. This seems like an important distinction.
Thanks for your engagement and these insightful questions.
I consistently get an error message when I try to set the CI to 50% in the OpenPhil bar (and the URL is crazy long!)
That sounds like a bug. Thanks for reporting!
(The URL packs in all the settings, so you can send it to someone else—though I’m not sure this is working on the main page. To do this, it needs to be quite long.)
Why do we have probability distributions over values that are themselves probabilities? I feel like this still just boils down to a single probability in the end.
You’re right, it does. Generally, the aim here is just conceptual clarity. It can be harder to assess the combination of two probability assignments than those assignments individually.
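As a toy illustration of both halves of that answer (the Beta(2, 5) choice is arbitrary), a distribution over a probability collapses to a point for the bottom line, but still shows where the uncertainty lives:

```python
import random

random.seed(0)

# Second-order credence: the probability that (say) an intervention works is
# itself uncertain, modeled here by an arbitrary Beta(2, 5) distribution.
samples = [random.betavariate(2, 5) for _ in range(100_000)]

# For a bottom-line expectation, the distribution collapses to its mean:
collapsed = sum(samples) / len(samples)  # close to 2 / (2 + 5)

# But the distribution still carries extra information, e.g. how much weight
# sits below 10%, which a single summary probability would hide:
share_below_10pct = sum(1 for s in samples if s < 0.1) / len(samples)
```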
Why do we sometimes use cost per DALY? It seems unnecessarily confusing.
Yeah. It has been a point of confusion within the team too. The reason for cost per DALY is that it is a metric often used by people making allocation decisions. However, it isn’t a great representation for Monte Carlo simulations where a lot of outcomes involve no effect, because the cost per DALY is effectively infinite. This has some odd implications. For our purposes, DALYs per $1000 is a better representation. To try to accommodate both considerations, we include both values in different places.
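A toy illustration of why (the outcome numbers are invented):

```python
# Hypothetical Monte Carlo draws of DALYs averted per $1000 spent: most runs
# of the simulation produce no effect at all, a few produce a large one.
dalys_per_1000 = [0, 0, 0, 0, 0, 0, 0, 0, 30, 50]

# DALYs per $1000 averages cleanly even with the zero-effect runs:
mean_dalys_per_1000 = sum(dalys_per_1000) / len(dalys_per_1000)  # 8.0

# Cost per DALY misbehaves: a zero-effect run has effectively infinite cost,
# which swamps the mean entirely.
costs_per_daly = [1000 / d if d > 0 else float("inf") for d in dalys_per_1000]
mean_cost_per_daly = sum(costs_per_daly) / len(costs_per_daly)  # inf
```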
OK, but what if life is worse than 0, surely we need a way to represent this as well? My vague memory from the moral weights series was that you assumed valence is symmetric about 0, so perhaps the more sensible unit would be the negative of the value of a fully content life.
The issue here is that interventions can affect different levels of suffering. For instance, a corporate campaign might include multiple asks that affect animals in different ways. We could have made the model more complicated by incorporating its effect on each different level. Instead, we simplified by ‘summarizing’ the impact with one level. We calibrated with research on the impact of similar afflictions in humans. You can represent a negative value just by choosing a higher number of hours than actually suffered. Think of it in terms of the amount of normal life that that suffering would balance out. If it is really bad, one hour of suffering might be as bad as weeks of normal life would be good.
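In toy numbers (the three-week exchange rate and the waking-hours figure are made up), the inflation trick looks like this:

```python
# Hypothetical calibration: one hour of this suffering is judged as bad as
# three weeks of ordinary life are good.
hours_suffered = 1
weeks_of_normal_life_balanced = 3
waking_hours_per_week = 112  # roughly 16 waking hours a day

# Enter the inflated figure into the model in place of the literal hour:
model_suffering_hours = hours_suffered * weeks_of_normal_life_balanced * waking_hours_per_week
# 336 model-hours stand in for 1 actual hour of very bad suffering.
```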
“The intervention is assumed to produce between 160 and 3.6K suffering-years per dollar (unweighted) conditional on chickens being sentient.” This input seems unhelpfully coarse-grained, as it seems to hide a lot of the interesting steps and doesn’t tell me anything about how these numbers are estimated, and it is not the sort of thing I can intelligently just choose my own numbers for.
There is a balance between accuracy and model configurability. In some places, we want to include numbers that are based on other research that we thought was likely to be accurate, but where we couldn’t directly translate into the parameters of the model. I would like to convert those assessments into the terms of the model, maybe backtracking to see what parameters get a similar answer, but this wasn’t a priority.
In the small-scale biorisk project, I never seem to get more than about 1000 DALYs per $1000, even when I crank expansion speed to 0.9c and length of future to 1e8, and the annual extinction risk in era 4 to 1e-8. Why is this? Yes 150,000 is too few, but I thought I should at least see some large effect when I change key parameters by several OOMs. Not really sure what is going on here, I’ll be interested if you replicate this, and whether there is a bug or I am just misunderstanding something.
Our estimates include both calculations of catastrophic events and extinction. For the small-scale biorisk, the chance of a catastrophic event is relatively high, but the chance of extinction is low. I think you’re seeing the results of catastrophic events and no extinction events. When I raise the probability of extinction and include the far future, I see very large numbers. (E.g. https://bit.ly/ccm-bio-high-risk).
(1) Unfortunately, we didn’t record any predictions beforehand. It would be interesting to compare. That said, the process of constructing the model is instructive in thinking about how to frame the main cruxes, and I’m not sure what questions we would have thought were most important in advance.
(2) Monte Carlo methods have the advantage of flexibility. A direct analytic approach will work until it doesn’t, and then it won’t work at all. Running a lot of simulations is slower and has more variance, but it doesn’t constrain the kind of models you can develop. Models change over time, and we didn’t want to limit ourselves at the outset.
As for whether such an approach would work with the model we ended up with: perhaps, but I think it would have been very complicated. There are some aspects of the model that seem to me like they would be difficult to assess analytically—such as the breakdown of time until extinction across risk eras with and without the intervention, or the distinction between catastrophic and extinction-level risks.
We are currently working on incorporating some more direct approaches into our model where possible in order to make it more efficient.
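The tradeoff can be shown with a toy contrast (the uniform distributions are chosen purely for illustration):

```python
import random

random.seed(1)

# Analytic route: for independent X ~ Uniform(0, 1) and Y ~ Uniform(0, 2),
# E[X * Y] = E[X] * E[Y] = 0.5 * 1.0 = 0.5 -- one line, but only because this
# model is simple enough to solve in closed form.
analytic = 0.5

# Monte Carlo route: noisier and slower, but it keeps working after the model
# gains conditionals, truncation, or era-by-era branching.
n = 200_000
monte_carlo = sum(random.uniform(0, 1) * random.uniform(0, 2) for _ in range(n)) / n
```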
There’s discussion on LessWrong related to how “the number of pigs in gestation crates (at least) doubles!” is probably a confused way of thinking.
Sure, but how small is the probability that it isn’t? It has to be really small to counteract the amount of value doubling would provide.
That seems unphysical, since we’re saying that even if something made no actual physical difference, it can still make a difference for subjective experience.
The neuron is still there, so its existing-but-not-firing makes a physical difference, right? Not firing is as much a thing a neuron can do as firing. (Also, for what it’s worth, my impression is that cognition is less about which neurons are firing and more about what rate they are firing at and how their firing is coordinated with that of other neurons.)
But neurons don’t seem special, and if you reject counterfactual robustness, then it’s hard to see how we wouldn’t find consciousness everywhere, and not only that, but maybe even human-like experiences, like the feeling of being tortured, could be widespread in mundane places, like in the interactions between particles in walls.
The patterns of neural firing involved in conscious experiences are surely quite complicated. Why think that we would find similar patterns anywhere outside of brains?
Thanks for highlighting these concerns. This is something I fretted about before writing this, and I condensed my thoughts into footnote 1. Let me expand on them here:
1.) These sorts of studies are long out of vogue. I don’t believe my engaging with them (especially on the EA forum, which confers little academic prestige) will encourage any similar experiments to be carried out in the future. I also don’t think it will affect the status of the researchers or the trajectory of their careers.
2.) There are a huge number of experiments that are callously harmful to sentient creatures like rats, as you note. Decortication studies stand out because they involve harms to bodily and mental integrity, which we find particularly repulsive, but many experiments in psychology, medicine, and neuroscience routinely involve killing their test subjects. I’m hesitant to disengage from such research (or to refuse to benefit from it, or let other animals benefit from it) entirely.
3.) All sorts of work indirectly contributes to animal suffering. It is conceivable to me that more suffering is caused by poisoning rats / mice around your average university building, or to provide food for the average university conference, than was caused by these studies to the animals involved. Avoiding engaging with work that involves avoidable animal suffering is extremely difficult. I don’t think it makes sense to disengage with work just because the harms it causes are more obvious.
4.) Understanding consciousness is important for cause prioritization. These sorts of studies have the potential to tell us a lot that might bear on how we think about projects aiming to benefit fish or insects. If they can help us direct funds more effectively for animals, we should pay attention to them.
5.) Animal activists have a reputation for naivete and credulity. Engaging substantively with science, which necessarily includes studies that cruelly harm animals, may help us to be taken more seriously.
Suppose you’ve been captured by some terrorists and you’re tied up with your friend Eli. There is a device on the other side of the room that you can’t quite make out. Your friend Eli says that he can tell (he’s 99% sure) it is a bomb and that it is rigged to go off randomly. Every minute, he’s confident there’s a 50-50 chance it will explode, killing both of you. You wait a minute and it doesn’t explode. You wait 10. You wait 12 hours. Nothing. He starts eyeing the light fixture, and says he’s pretty sure there’s a bomb there too. Do you believe him?
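For what it’s worth, the update can be made explicit with Bayes’ rule. The numbers come from the story; the function itself is just a sketch.

```python
def posterior_bomb(prior: float, quiet_minutes: int, p_boom_per_minute: float = 0.5) -> float:
    """Bayes' rule: probability the device is a bomb, given minutes of silence."""
    likelihood_if_bomb = (1 - p_boom_per_minute) ** quiet_minutes
    numerator = prior * likelihood_if_bomb
    # If it's not a bomb, silence is guaranteed (likelihood 1).
    return numerator / (numerator + (1 - prior) * 1.0)

p_start = posterior_bomb(0.99, 0)    # 0.99: Eli's initial confidence
p_10min = posterior_bomb(0.99, 10)   # ~0.09: ten quiet minutes already gut the hypothesis
p_12hrs = posterior_bomb(0.99, 720)  # vanishingly small after 12 hours of nothing
```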