I still don’t see the case for building earliness into our priors, rather than updating on the basis of finding oneself seemingly-early.
If we’re doing things right, it shouldn’t matter whether we’re building earliness into our prior or updating on the basis of earliness.
Let the set H = “the 1e10 (i.e. 10 billion) most influential people who will ever live” and let E = “the 1e11 (i.e. 100 billion) earliest people who will ever live”. Assume that the future will contain 1e100 people. Let X be a randomly sampled person.
For our unconditional prior P(X in H), everyone agrees that uniform probability is appropriate, i.e., P(X in H) = 1e-90. (I.e. we’re not giving up on the self-sampling assumption.)
However, for our belief over P(X in H | X in E), i.e. the probability that a randomly chosen early person is one of the most influential people, some people argue we should use, e.g., an exponential function where earlier people are more likely to be influential (which could be called a prior over “X in H” based on how early X is). However, it seems like you’re saying that we shouldn’t assess P(X in H | X in E) directly from such a prior, but should instead get it from Bayesian updates. So let’s do that.
P(X in H | X in E) = P(X in E | X in H) * P(X in H) / P(X in E) = P(X in E | X in H) * 1e-90 / 1e-89 = P(X in E | X in H) * 1e-1 = P(X in E | X in H) / 10
So now we’ve switched over to instead making a guess about P(X in E | X in H), i.e. the probability that one of the 1e10 most influential people also is one of the 1e11 earliest people, and dividing by 10. That doesn’t seem much easier than making a guess about P(X in H | X in E), and it’s not obvious whether our intuitions here would lead us to expect more or less influentialness.
Also, the way that 1e-90 and 1e-89 are both extraordinarily unlikely, but divide out to becoming 1e-1, illustrates Buck’s point:
“if you condition on us being at an early time in human history (which is an extremely strong condition, because it has incredibly low prior probability), it’s not that surprising for us to find ourselves at a hingey time.”
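To make the arithmetic concrete, here is a minimal Python sketch of the update above. The set sizes come from the setup; the function name `p_h_given_e` and the three guesses for P(X in E | X in H) are illustrative assumptions, not estimates anyone in this thread has endorsed.

```python
from fractions import Fraction

TOTAL = 10**100   # assumed number of people who will ever live
H_SIZE = 10**10   # size of H, the most influential people
E_SIZE = 10**11   # size of E, the earliest people


def p_h_given_e(p_e_given_h):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    p_h = Fraction(H_SIZE, TOTAL)  # 1e-90, the uniform prior on being in H
    p_e = Fraction(E_SIZE, TOTAL)  # 1e-89, the prior on being in E
    return Fraction(p_e_given_h) * p_h / p_e


# Hypothetical guesses for P(X in E | X in H); each is simply divided by 10.
for guess in (Fraction(1, 1000), Fraction(1, 10), Fraction(1, 2)):
    print(guess, "->", p_h_given_e(guess))
# 1/1000 -> 1/10000
# 1/10 -> 1/100
# 1/2 -> 1/20
```

Exact fractions are used so that the two tiny priors cancel cleanly. If influence were independent of birth rank, P(X in E | X in H) would itself be 1e-89 and the function would return the uniform 1e-90.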
“If we’re doing things right, it shouldn’t matter whether we’re building earliness into our prior or updating on the basis of earliness.”
Thanks, Lukas, I thought this was very clear and exactly right.
“So now we’ve switched over to instead making a guess about P(X in E | X in H), i.e. the probability that one of the 1e10 most influential people also is one of the 1e11 earliest people, and dividing by 10. That doesn’t seem much easier than making a guess about P(X in H | X in E), and it’s not obvious whether our intuitions here would lead us to expect more or less influentialness.”
That’s interesting, thank you—this statement of the debate has helped clarify things for me. It does seem to me that doing the update - going via P(X in E | X in H) rather than directly trying to assess P(X in H | X in E) - is helpful, but I’d understand the position of someone who wanted just to assess P(X in H | X in E) directly.
I think it’s helpful to assess P(X in E | X in H) because it’s not totally obvious how one should update on the basis of earliness. The arrow of causality and the possibility of lock-in over time definitely give reasons in favor of influential people being earlier. But there’s still the big question of how great an update that should be. And the cumulative nature of knowledge and understanding gives reasons in favor of thinking that later people are more likely to be more influential.
This seems important to me because, for someone claiming that we should think that we’re at the HoH, the update on the basis of earliness is doing much more work than updates on the basis of, say, familiar arguments about when AGI is coming and what will happen when it does. To me at least, that’s a striking fact and wouldn’t have been obvious before I started thinking about these things.
It seems to me the object level is where the action is, and the non-simulation Doomsday Arguments mostly raise a phantom consideration that cancels out (in particular, cancelling out re whether there is an influenceable lock-in event this century).
You could say a similar thing about our being humans rather than bacteria, which cumulatively outnumber us by more than 1,000,000,000,000,000,000,000,000 times on Earth thus far according to the paleontologists.
Or you could go further and ask why we aren’t neutrinos? There are more than 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 of them in the observable universe.
However extravagant the class you pick, it’s cancelled out by the knowledge that we find ourselves in our current situation. I think it’s more confusing than helpful to say that our being humans rather than neutrinos is doing more than 10^70 times as much work as object-level analysis of AI in the case for attending to x-risk/lock-in with AI. You didn’t need to think about that in the first place to understand AI or bioweapons; it was an irrelevant distraction.
The same is true for future populations that know they’re living in intergalactic societies and the like. If we compare possible world A, where future Dyson spheres can handle a population of P (who know they’re in that era), and possible world B, where future Dyson spheres can support a population of 2P, they don’t give us much different expectations of the number of people finding themselves in our circumstances, and so cancel out.
The simulation argument (or a brain-in-vats story or the like) is different and doesn’t automatically cancel out because it’s a way to make our observations more likely and common. However, for policy it does still largely cancel out, as long as the total influence of people genuinely in our apparent circumstances is a lot greater than that of all simulations with apparent circumstances like ours: a bigger future world means more influence for genuine inhabitants of important early times and also more simulations. [But our valuation winds up being bounded by our belief about the portion of all-time resources allocated to sims in apparent positions like ours.]
Another way of thinking about this is that prior to getting confused by any anthropic updating, if you were going to set a policy for humans who find ourselves in our apparent situation across nonanthropic possibilities assessed at the object level (humanity doomed, Time of Perils, early lock-in, no lock-in), you would just want to add up the consequences of the policy across genuine early humans and sims in each (non-anthropically assessed) possible world.
A vast future gives more chances for influence on lock-in later, which might win out as even bigger than this century (although this gets rapidly less likely with time and expansion), but it shouldn’t change our assessment of lock-in this century, and a substantial chance of that gives us a good chance of HoH (or simulation-adjusted HoH).
One way to frame this is that we do need extraordinarily strong evidence to update from thinking that we’re almost certainly not the most influential time to thinking that we might plausibly be the most influential time. However, we don’t need extraordinarily strong evidence pointing towards us almost certainly being the most influential (that then “averages out” to thinking that we’re plausibly the most influential). It’s sufficient to get extraordinarily strong evidence that we are at a point in history which is plausibly the most influential. And if we condition on the future being long and on our not being in a simulation (because that’s probably when we have the most impact), we do in fact have extraordinarily strong evidence that we are very early in history, which is a point that’s plausibly the most influential.
The question which seems important to me now is: does Will think that the probability of high influentialness conditional on birth rank (but before accounting for any empirical knowledge) is roughly the same as the negative exponential distribution Toby discussed in the comments on his original post?
I actually think the negative exponential gives too little weight to later people, because I’m not certain that late people can’t be influential. But if I had a person from the first 1e-89 of all people who will ever live and a random person from the middle, I’d certainly say that the former was more likely to be one of the most influential people. They’d also be more likely to be one of the least influential people! Their position is just so special!
Maybe my prior would be like 30% to a uniform function, 40% to negative exponentials of various slopes, and 30% to other functions (e.g. the last person who ever lived seems more likely to be the most influential than a random person in the middle).
Only using a single, simple function for something so complicated seems overconfident to me. And any mix of functions where one of them assigns decent probability to early people being the most influential is enough that it’s not super unlikely that early people are the most influential.
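A rough Python sketch of why the mixture matters. The 30/40/30 weights are the ones suggested above, but the per-component values of P(X in H | X in E) are assumptions made up for illustration: the negative-exponential component is given 1/20 (a guess of 1/2 for P(X in E | X in H), divided by 10), and the “other functions” component is pessimistically set to zero. Because P(X in E) is the same under every component, the mixture’s conditional probability is just the weighted average.

```python
from fractions import Fraction

# P(X in H | X in E) if influence is independent of birth rank.
P_UNIFORM = Fraction(1, 10**90)

# name -> (weight, P(X in H | X in E) under that component)
components = {
    "uniform": (Fraction(3, 10), P_UNIFORM),
    # Hypothetical value, chosen only for illustration.
    "negative exponentials": (Fraction(4, 10), Fraction(1, 20)),
    # Pessimistic placeholder: assume these contribute nothing.
    "other functions": (Fraction(3, 10), Fraction(0)),
}


def mixture_probability(parts):
    """Weighted average of each component's P(X in H | X in E)."""
    if sum(w for w, _ in parts.values()) != 1:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * p for w, p in parts.values())


p_mix = mixture_probability(components)
print(float(p_mix))              # roughly 0.02
print(p_mix / P_UNIFORM > 10**88)  # True: vastly above the uniform answer
```

Under these assumed numbers the uniform component contributes almost nothing, and the result is driven entirely by the one component that favours early people. Changing the hypothetical 1/20 changes the answer proportionally, so the sketch shows the structure of the argument rather than a real estimate.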
“Only using a single, simple function for something so complicated seems overconfident to me. And any mix of functions where one of them assigns decent probability to early people being the most influential is enough that it’s not super unlikely that early people are the most influential.”
I strongly agree with this. The fact that under a mix of distributions, it becomes not super unlikely that early people are the most influential, is really important and was somewhat buried in the original comments-discussion.
And then we’re also very distinctive in other ways: being on one planet, being at such a high-growth period, etc.
Thanks, I agree that this is key. My thoughts:
I agree that our earliness gives a dramatic update in favor of us being influential. I don’t have a stable view on the magnitude of that.
I’m not convinced that the negative exponential form of Toby’s distribution is the right one, but I don’t have any better suggestions.
Like Lukas, I think that Toby’s distribution gives too much weight to early people, so the update I would make is less dramatic than Toby’s.
Seeing as Toby’s prior is quite sensitive to choice of reference-class, I would want to choose the reference class of all observer-moments, where an observer is a conscious being. This means we’re not as early as we would say if we used the distribution of Homo sapiens, or of hominids. I haven’t thought about what exactly that means, though my intuition is that it means the update isn’t nearly as big.
So I guess the answer to your question is ‘no’: our earliness is an enormous update, but not as big as Toby would suggest.