Do you think it would be better if no one who worked at OpenAI / Anthropic / Deepmind worked on safety? If those organizations devoted less of their budget to safety? (Or do you think we should want them to hire for those roles, but hire less capable or less worried people, so individuals should avoid potentially increasing the pool of talent from which they can hire?)
Derek Shiller
Rethink Priorities’ Cross-Cause Cost-Effectiveness Model: Introduction and Overview
The importance of getting digital consciousness right
Fanatical EAs should support very weird projects
Notes on the risks and benefits of kidney donation
EA should be willing to explore all potentially fruitful avenues of mission fulfillment without regard to taboo.
In general, where it doesn’t directly relate to cause areas of principal concern to effective altruists, I think EAs should strive to respect others’ sacred cows as much as possible. Effective Altruism is a philosophy promoting practical action. It would be harder to find allies who will help us achieve our goals if we are careless about the things other people care a lot about.
Implementational Considerations for Digital Consciousness
What might decorticate rats tell us about the distribution of consciousness?
I think you’re right that we don’t provide a really detailed model of the far future and we underestimate* expected value as a result. It’s hard to know how to model the hypothetical technologies we’ve thought of, let alone the technologies that we haven’t. These are the kinds of things you have to take into consideration when applying the model, and we don’t endorse the outputs as definitive, even once you’ve tailored the parameters to your own views.
That said, I do think the model has greater flexibility than you suggest. Some of these options are hidden by default, because they aren’t relevant given the cutoff year of 3023 we default to. You can see them by extending that year far out. Our model uses parameters for expansion speed and population per star. It also lets you set the density of stars. If you think that we’ll expand at near the speed of light and colonize every brown dwarf, you can set that. If you think each star will host a quintillion minds, you can set that too. We don’t try to handle relative welfare levels for future beings; we just assume their welfare is the same as ours. This is probably pessimistic. We considered changing this, but it actually doesn’t make a huge difference to the overall shape of the results, so we didn’t consider it a priority. The same goes for clock speed differences. If you want to represent this within the model as written, you can just inflate the population per star. What the model can’t do is capture non-cubic (and non-static) population growth rates. It also breaks down in the real far future, and we don’t model the end of the universe.
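To make the structure concrete, here is a rough sketch of how those parameters combine under a simple cubic-expansion picture. This is not the model’s actual code; the function name, parameter names, and example numbers are all illustrative.

```python
import math

def total_future_population(expansion_speed_ly_per_year, stars_per_cubic_ly,
                            population_per_star, years):
    """Illustrative only: minds reached under simple spherical expansion."""
    radius = expansion_speed_ly_per_year * years          # light-years
    volume = (4.0 / 3.0) * math.pi * radius ** 3          # cubic light-years
    return stars_per_cubic_ly * volume * population_per_star

# e.g. near-light-speed expansion, roughly one settled star per 300 cubic
# light-years, a quintillion minds per star, out to a 1,000-year horizon
print(f"{total_future_population(0.99, 1/300, 1e18, 1000):.2e}")
```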
Perhaps you object to parameter settings we chose as defaults. Whatever defaults we picked would be controversial. In response, let me just stress that they’re not intended as our answers to these questions. They are just a flexible starting point for people to explore.
* My guess is that the EV of surviving to the far future is infinite, if it isn’t undefined.
A couple of thoughts:
- This argument doesn’t seem specific to longtermism. You could make the same case for short-term animal welfare. If eating a chicken sandwich every day would make you slightly more effective at passing sweeping changes to mitigate the harms of factory farming, then doing so is highly net positive in expectation even if you only care about chickens in the near future.
- This argument doesn’t seem specific to veganism. You could make the same case for being a jerk in all manner of ways. If keying strangers’ cars helped you relax and get insight into the alignment problem, then the same reasoning might suggest you should do it.
This isn’t to say the argument is wrong, but I find the implications very distasteful.
- I think it is valuable to have this stuff on record. If it isn’t recorded anywhere, then anyone who wants to reference this position in another academic work—even if it is the consensus within a field—is left presenting it in a way that makes it look like their personal opinion.
It seems like an SBF-type-figure could justify any action if the lives of trillions of future people are in the balance.
This doesn’t seem specific to utilitarianism. I think most ethical views would suggest that many radical actions would be acceptable if billions of lives hung in the balance. The ethical views that wouldn’t allow such radical actions would have their own crazy implications. Utilitarianism does make it easier to justify such actions, but with numbers so large I don’t think it generally makes a difference.
Big fan of your sequence!
I’m curious how you think about bounded utility functions. It’s not something I’ve thought about much. The following sort of case seems problematic.
Walking home one night from a lecture on astrophysics where you learned about the latest research establishing the massive size of the universe, you come across a child drowning in a pond. The kid is kicking and screaming, trying to stay above the water. You can see the terror in his eyes and you know that it’s going to get painful when the water starts filling his lungs. You see his mother, off in the distance, screaming and running. Something just tells you she’ll never get over this. It will wreck her marriage and her career. There’s a life preserver within easy reach. You could save the child without much fuss. But you recall from your lecture the oodles and oodles of people living on other planets and figure that we must be very near the bound of total value for the universe, so the kid’s death can’t be of more than the remotest significance. And there’s a real small chance that solipsism is true, in which case your whims matter much more (we’re not near the bounds) and satisfying them will make a much bigger difference to total value. The altruistic thing to do is to spare yourself the mildly unpleasant effort, even though it very likely means the kid will die an agonizing death and his mother will mourn for decades.
That seems really wrong. Much more so than thinking that fanaticism is unreasonable.
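To spell out the arithmetic I have in mind (the saturating functional form, and the symbols B and k, are just my illustration, not anything you’ve committed to): with a bounded utility function such as

$$U(x) = B\,\frac{x}{x+k},$$

the marginal value of one more good life is

$$\frac{dU}{dx} = \frac{Bk}{(x+k)^2},$$

which is vanishingly small when $x$ is astronomically large (the big-universe hypothesis) but appreciable when $x$ is tiny (solipsism). So even a minuscule credence $p$ in solipsism can make

$$p \cdot \Delta U_{\text{solipsism}} > (1-p) \cdot \Delta U_{\text{big universe}},$$

favoring the whim over the rescue.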
The problem with considering optics is that it’s chaotic.
The world is chaotic, and everything EAs try to do has a largely unpredictable long-term effect because of complex dynamic interactions. We should try to think through the contingencies and make the best guess we can, but completely ignoring chaotic considerations just seems impossible.
It’s a better heuristic to focus on things which are actually good for the world, consistent with your values.
This sounds good in principle, but there are a ton of things that might conceivably be good-but-for-PR-reasons where the PR reasons are decisive. E.g. should EAs engage in personal harassment campaigns against productive ML researchers in order to slow AI capabilities research? Maybe that would be good if it weren’t terrible PR, but I think we very obviously should not do it because it would be terrible PR.
Google could build a conscious AI in three months
There is some nuance to the case that seems to get overlooked in the poll. I feel completely free to express opinions in a personal capacity that might be at odds with my employer, but I also feel that there are some things it would be inappropriate to say while carrying out my job without running it by them first. It seems like you’re interested in the latter feeling, but the poll is naturally interpreted as addressing the former.
Toby Ord argues that this is incoherent because there are no natural units in which to measure happiness and suffering, and therefore it’s unclear what it even means to put them on the same scale.
One problem might be that there are no natural units on which to measure happiness and suffering. Another is that there are too many. If there are a hundred thousand different ways to put happiness and suffering on the same scale and they all differ in the exchange rate they imply, then it seems you’ve got the same problem. Your example of comparisons in terms of elementary particles feels somewhat arbitrary, which makes me think this may be an issue.
I think the greater potential concern is false-positives on consciousness, not false negatives
This is definitely a serious worry, but it seems much less likely to me.
One way this could happen is if we build large numbers of general purpose AI systems that we don’t realize are conscious and/or can suffer. However, I think that suffering is a pretty specialized cognitive state that was designed by natural selection for a role specific to our cognitive limitations, and not one we are likely to encounter by accident while building artificial systems. (It seems more likely to me that digital minds won’t suffer, but will have states that are morally relevant that we don’t realize are morally relevant because we’re so focused on suffering.)
Another way this could happen is if we artificially simulate large numbers of biological minds in detail. However, it seems very unlikely to me that we will ever run those simulations, and very unlikely that we would miss the potential for accidental suffering if we do. At least in the short term, I expect most plausible digital minds will be intentionally designed to be conscious, which I think makes the risks of mistakenly believing they’re conscious more of a worry.
That said, I’m wary of trying to adjudicate which is more concerning when the topics are still so speculative.
proposed “p-risk” after “p-zombies”
I kinda like “z-risk”, for similar reasons.
I believe Marcus and Peter will release something before long discussing how they actually think about prioritization decisions.
Suppose you’ve been captured by some terrorists and you’re tied up with your friend Eli. There is a device on the other side of the room that you can’t quite make out. Your friend Eli says that he can tell (he’s 99% sure) it is a bomb and that it is rigged to go off randomly. Every minute, he’s confident there’s a 50-50 chance it will explode, killing both of you. You wait a minute and it doesn’t explode. You wait ten minutes. You wait twelve hours. Nothing. He starts eyeing the light fixture, and says he’s pretty sure there’s a bomb there too. Do you believe him?
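For what it’s worth, the arithmetic behind the analogy (assuming each silent minute is independent evidence): twelve silent hours give

$$P(\text{no explosion in 720 min} \mid \text{bomb}) = (1/2)^{720} \approx 2 \times 10^{-217},$$

so by Bayes’ rule even a 99% prior in Eli’s story collapses to roughly

$$P(\text{bomb} \mid \text{silence}) = \frac{0.99 \cdot 2^{-720}}{0.99 \cdot 2^{-720} + 0.01} \approx 2 \times 10^{-215}.$$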