I ran the EA Berkeley group and later the UWashington group, and even this estimate seems high to me (but it would be within my 90% confidence interval, whereas 2000 definitely is not).
Therefore, it is a straw man to argue that NUs don’t value life or positive states: NUs value them instrumentally, which may translate into substantial practical efforts to protect them (even compared with someone who claims to be terminally motivated by them).
By my understanding, a universe with no conscious experiences is the best possible universe by ANU (though there are other equally good universes as well). Would you agree with that?
If so, that’s a strong reason for me to reject it. I want my ethical theory to say that a universe with positive conscious experiences is strictly better than one with no conscious experiences.
I was going to post a few lists that hadn’t already been posted, but this one had all of them already :)
I think 4, 5 and 6 are all valid even if you take the CAIS view. Could you explain how you think those depend on the AGI being an independent agent?
Plausibly 2 and 3 also apply to CAIS, though those are more ambiguous.
Actually, my summary of that post initially dropped the obligation frame because of these reasons :P (Not intentionally, since I try to have objective summaries, but I basically ignored the obligation point while reading and so forgot to put it in the summary.)
I do think the opportunity frame is much more reasonable in that setting, because “human safety problems” are something that you might have been resigned to in the past, and AI design is a surprising option that might let us fix them, so it really does sound like good news. On the other hand, the surprising part about effective altruism is “people are dying for such preventable reasons that we can stop it for thousands of dollars”, which is bad news that it’s really hard to be excited by.
Not sure. A few hypotheses:
Arxiv Sanity has become better at predicting what I care about as I’ve given it more data. I don’t think this is the whole story because the absolute number of papers I see on Twitter has gone down.
I did create my Twitter account primarily for academic stuff, but it’s possible that over time Twitter has learned to show me non-academic stuff that is more attention-grabbing or controversial, despite me trying not to click on those sorts of things.
Academics are promoting their papers less on Twitter.
Not the OP, but the Alignment Newsletter (which I write) should help for technical AI safety. I source from newsletters, blogs, Arxiv Sanity and Twitter (though Twitter is becoming less useful over time). I’d imagine you could do the same for other fields as well.
these sorts of techniques have been applied for decades and have never achieved anything close to human level AI
We also didn’t have the vast amounts of compute that we have today.
other parts of Bostrom’s argument rely upon much broader conceptions of intelligence that would entail the AI having common sense.
My claim is that you can write a program that “knows” about common sense, but still chooses actions by maximizing a function, in which case it’s going to interpret that function literally and not through the lens of common sense. There is currently no way that the “choose actions” part gets routed through the “common sense” part the way it does in humans. I definitely agree that we should try to build an AI system which does interpret goals using common sense—but we don’t know how to do that yet, and that is one of the approaches that AI safety is considering.
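To make this concrete, here is a minimal sketch (all names hypothetical, not a description of any real system) of the kind of program I mean: a common-sense model exists and could be queried, but action selection only ever maximizes the literal objective, so the common sense never affects what the agent does.

```python
# Minimal illustrative sketch: the agent "knows" common sense in the sense
# that common_sense_judgment is available, but choose_action never consults
# it -- actions are picked purely by maximizing the literal objective.

def common_sense_judgment(action):
    """Stand-in for a common-sense model; defined but never used below."""
    return "a human would object to this" if action == "drastic" else "seems fine"

def literal_objective(action):
    """The hand-specified function the agent maximizes, taken literally."""
    return {"mild": 1.0, "drastic": 100.0}[action]

def choose_action(actions):
    # Action selection = argmax of the literal objective; the common-sense
    # module above plays no role in this choice.
    return max(actions, key=literal_objective)

print(choose_action(["mild", "drastic"]))  # -> "drastic", whatever common sense says
```

The gap the paragraph above points at is precisely that nothing routes choose_action through common_sense_judgment; figuring out how to do that robustly is part of the work AI safety is trying to do.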
I agree with the prediction that AGI systems will interpret goals with common sense, but that’s because I expect that we humans will put in the work to figure out how to build such systems, not because any AGI system that has the ability to use common sense will necessarily apply that ability to interpreting its goals.
If we found out today that someone created our world + evolution in order to create organisms that maximize reproductive fitness, I don’t think we’d start interpreting our sex drive using “common sense” and stop using birth control so that we more effectively achieved the original goal we were meant to perform.
I’m not really arguing for Bostrom’s position here, but I think there is a sensible interpretation of it.
Goals/motivation = whatever process the AI uses to select actions.
There is an implicit assumption that this process will be simple and of the form “maximize this function over here”. I don’t like this assumption as an assumption about any superintelligent AI system, but it’s certainly true that our current methods of building AI systems (specifically reinforcement learning) are trying to do this, so at minimum you need to make sure that we don’t build AI using reinforcement learning, or that we get its reward function right, or that we change how reinforcement learning is done somehow.
If you are literally just taking actions that maximize a particular function, you aren’t going to interpret that function using common sense, even if you have the ability to use common sense. Again, I think we could build AI systems that used common sense to interpret human goals—but this is not what current systems do, so there’s some work to be done here.
The arguments you present here are broadly similar to ones that make me optimistic that AI will be good for humanity, but there is work to be done to get there from where we are today.
my impression was that progress was quite jumpy at times, instead of slow and steady.
So let’s say you have an Artificial Intelligence that thinks enormously faster than a human.
But why didn’t you have an AI that thinks only somewhat faster than a human before that?
My math-intuition says “that’s still not well-defined, such reasons may not exist”.
To which you might say “Well, there’s some probability they exist, and if they do exist, they trump everything else, so we should act as though they exist.”
My intuition says “But the rule of letting things that could exist be the dominant consideration seems really bad! I could invent all sorts of categories of things that could exist, that would trump everything I’ve considered so far. They’d all have some small probability of existing, and I could direct my actions any which way in this manner!” (This is what I was getting at with the “meta-oughtness” rule I was talking about earlier.)
To which you might say “But moral reasons aren’t some hypothesis I pulled out of the sky, they are commonly discussed and have been around in human discourse for millennia. I agree that we shouldn’t just invent new categories and put stock into them, but moral reasons hardly seem like a new category.”
And my response would be “I think moral reasons of the type you are talking about mostly came from the human tendency to anthropomorphize, combined with the fact that we needed some way to get humans to coordinate. Humans weren’t likely to just listen to rules that some other human made up, so the rules had to come from some external source. And in order to get good coordination, the rules needed to be followed, and so they had to have the property that they trumped any prudential reasons. This led us to develop the concept of rules that come from some external source and trump everything else, giving us our concept of moral reasons today. Given that our concept of “moral reasons” probably arose from this sort of process, I don’t think that “moral reasons” is a particularly likely thing to actually exist, and it seems wrong to base your actions primarily on moral reasons. Also, as a corollary, even if there do exist reasons that trump all other reasons, I’m more likely to reject the intuition that they must come from some external source independent of humans, since I think that intuition was created by this non-truth-seeking process I just described.”
Okay, cool, I think I at least understand your position now. Not sure how to make progress though. I guess I’ll just try to clarify how I respond to imagining that I held the position you do.
From my perspective, the phrase “moral reason” has both the connotation that it is external to humans and that it trumps all other reasons, and that’s why the intuition is so strong. But if it is decomposed into those two properties, it no longer seems (to me) that they must go together. So from my perspective, when I imagine how I would justify the position you take, it seems to be a consequence of how we use language.
What I have most moral reason to do is what there is most reason to do impartially considered (i.e. from the point of view of the universe)
My intuitive response is that that is an incomplete definition and we would also need to say what impartial reasons are, otherwise I don’t know how to identify the impartial reasons.
4. I don’t think I understand the set up of this question—it doesn’t seem to make a coherent sentence to replace X with a number in the way you have written it.
I did mean for you to replace X with a phrase, not a number.
If my intuition here is right then moral reasons must always trump prudential reasons. Note I don’t have anything more to offer than this intuition, sorry if I made it seem like I did!
Your intuition involves the complex phrase “moral reason” for which I could imagine multiple different interpretations. I’m trying to figure out which interpretation is correct.
Here are some different properties that “moral reason” could have:
1. It is independent of human desires and goals.
2. It trumps all other reasons for action.
3. It is an empirical fact about either the universe or math that can be derived by observation of the universe and pure reasoning.
My main claim is that properties 1 and 2 need not be correlated, whereas you seem to have the intuition that they are, and I’m pushing on that.
A secondary claim is that if it does not satisfy property 3, then you can never infer it and so you might as well ignore it, but “irreducibly normative” sounds to me like it does not satisfy property 3.
Here are some models of how you might be thinking about moral reasons:
a) Moral reasons are defined as the reasons that satisfy property 1. If I think about those reasons, it seems to me that they also satisfy property 2.
b) Moral reasons are defined as the reasons that satisfy property 2. If I think about those reasons, it seems to me that they also satisfy property 1.
c) Moral reasons are defined as the reasons that satisfy both property 1 and property 2.
My response to a) and b) is of the form “That inference seems wrong to me and I want to delve further.”
My response to c) is “Define prudential reasons as the reasons that satisfy property 2 and not-property 1, then prudential reasons and moral reasons both trump all other reasons for action, which seems silly/strange.”
Not if the best thing to do is actually what the supreme being said, and not what you think is right, which is (a natural consequence of) the argument in this post.
(Tbc, I do not agree with the argument in the post.)
There seems to be something that makes you think that moral reasons should trump prudential reasons. The overall thing I’m trying to do is narrow down on what that is. In most of my comments, I’ve thought I’ve identified it, and so I argued against it, but it seems I’m constantly wrong about that. So let me try and explicitly figure it out:
How much would you agree with each of these statements:
If there is a conflict between moral reasons and prudential reasons, you ought to do what the moral reasons say.
If it is an empirical fact about the universe that, independent of humans, there is a process for determining what actions one ought to take, then you ought to do what that process prescribes, regardless of what you desire.
If it is an empirical fact about the universe that, independent of humans, there is a process for determining what actions to take to maximize utility, then you ought to do what that process prescribes, regardless of what you desire.
If there is an external-to-you entity satisfying property X that prescribes actions you should take, then you ought to do what it says, regardless of what you desire. (For what value of X would you agree with this statement?)
I have a very low credence that your proposed meta-normative rule would be true?
I also have a very low credence of that meta-normative rule. I meant to contrast it to the meta-normative rule “binding oughtness trumps regular oughtness”, which I interpreted as “moral reasons trump prudential reasons”, but it seems I misunderstood what you meant there, since you mean “binding oughtness” to apply both to moral and prudential reasons, so ignore that argument.
I agree, my view stems from a bedrock of intuition, that just as the descriptive fact that ‘my table has four legs’ won’t create normative reasons for action, neither will the descriptive fact that ‘Harry desires chocolate ice-cream’ create them either.
This makes me mildly worried that you aren’t able to imagine the worldview where prudential reasons exist. Though I have to admit I’m confused why under this view there are any normative reasons for action—surely all such reasons depend on descriptive facts? Even with religions, you are basing your normative reasons for action upon descriptive facts about the religion.
(Btw, random note, I suspect that Ben Pace above and I have very similar views, so you can probably take your understanding of his view and apply it to mine.)
I see, that makes sense, and I agree with it.
I and most other people (I’m pretty sure) wouldn’t chase the highest probability of infinite utility, since most of those scenarios are also highly implausible and feel very similar to Pascal’s mugging.
However these just wouldn’t constitute normative reason for action and that’s just what you need for an action to be choice-worthy.
As I don’t think that mere desires create reasons for action I think we can ignore them unless they are actually prudential reasons.
I don’t know how to argue against this; you seem to be taking it as axiomatic. The one thing I can say is that it seems clearly obvious to me that your desires and goals can make some actions better to choose than others. It only becomes non-obvious if you expect there to be some external-to-you force telling you how to choose actions, but I see no reason to assume that. It really is fine if your actions aren’t guided by some overarching rule granted authority by virtue of being morality.
But I suspect this isn’t going to convince you. Can we simply assume that prudential reasons exist and figure out the implications?
The distinction between normative/prudential is one developed in the relevant literature, see this abstract for a paper by Roger Crisp to get a sense for it.
Thanks, I think I’ve got it now. (Also it seems to be in your appendix, not sure how I missed that before.)
The issue is that we’re trying to work out how to act with uncertainty about what sort of world we’re in?
I know, and I think in the very next paragraph I try to capture your view, and I’m fairly confident I got it right based on your comment.
However, it seems jarring to think that a person who does what there is most moral reason to do could have failed to do what there was most, all things considered, reason for them to do.
This seems tautological when you define morality as “binding oughtness” and compare against regular oughtness (which presumably applies to prudential reasons). But why stop there? Why not go to metamorality, or “binding meta-oughtness” that trumps “binding oughtness”? For example, “when faced with uncertainty over ought statements, choose the one that most aligns with prudential reasons”.
It is again tautologically true that a person who does what there is most metamoral reason to do could not have failed to do what there was most, all things considered, reason for them to do. It doesn’t sound as compelling, but I claim that is because we don’t have metamorality as an intuitive concept, whereas we do have morality as an intuitive concept.
With that terminology, I think your argument is that we should ignore worlds without a binding oughtness. But in worlds without a binding oughtness, you still have your own desires and goals to guide your actions. This might be what you call ‘prudential’ reasons, but I don’t really understand that term—I thought it was synonymous with ‘instrumental’ reasons, but taking actions for your own desires and goals is certainly not ‘instrumental’.
So it seems to me that in worlds with a binding oughtness that you know about, you should take actions according to that binding oughtness, and otherwise you should take actions according to your own desires and goals.
You could argue that binding oughtness always trumps desires and goals, so that your action should always follow the binding oughtness that is most likely, and you can put no weight on desires and goals. But I would want to know why that’s true.
Like, I could also argue that actually, you should follow the binding meta-oughtness rule, which tells you how to derive ought statements from is statements, and that should always trump any particular oughtness rule, so you should ignore all of those and follow the most likely meta-oughtness rule. But this seems pretty fallacious. What’s the difference?