I’ve never understood the Bayesian logic of the anthropic shadow argument. I actually posted a question about this on the EA forum before, and didn’t get a good answer. I’d appreciate it if someone could help me figure out what I’m missing. When I write down the causal diagram for this situation, I can’t see how an anthropic shadow effect could be possible.

Section 2 of the linked paper shows that the probability that a catastrophic event occurred in some past time frame, given that we exist now, P(B_2|E), is smaller than its actual probability of occurring in that time frame, P. The gap widens the less likely we are to survive the catastrophic event (they call our probability of survival Q). It’s easy to understand why that is true: we are more likely to exist now if the event did not occur than if it did. In the extreme case where we are certain to be wiped out by the event, P(B_2|E) = 0.
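For concreteness, here is how I would reproduce that result with Bayes' theorem. This is a sketch under my own reading of the setup, not a quote from the paper: I assume we survive with certainty if the event does not occur, and with probability Q if it does.

```latex
P(B_2 \mid E)
  = \frac{P(E \mid B_2)\,P(B_2)}
         {P(E \mid B_2)\,P(B_2) + P(E \mid \neg B_2)\,P(\neg B_2)}
  = \frac{Q\,P}{Q\,P + (1 - P)}
```

Under those assumptions the expression equals P when Q = 1, is strictly smaller than P whenever Q < 1 (for 0 < P < 1), and equals 0 when Q = 0.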

This means that if you re-ran the history of the world thousands of times, the ones with observers around at our time would have fewer catastrophic events in their past, on average, than is suggested by P. I am completely happy with this.

But the paper then leaps from this observation to the conclusion that our naive estimate of the frequency of catastrophic events (i.e. our estimate of P) must be biased downwards. This is the point where I lose the chain of reasoning. Here is why.

What we care about here is not P(B_2|E). What we care about is our estimate of P itself. We would ideally like to calculate the posterior distribution of P, given both B_1,2 (the occurrence/non-occurrence of the event in the past), and our existence, E. The causal diagram here looks like this:

P → B_2 → E

This diagram means: P influences B_2 (the catastrophic event occurring), which influences E (our existence). But P does not influence E except through B_2.

*This means if we condition on B_2, the fact we exist now should have no further impact on our estimate of P*

To sum up my confusion: The distribution of (P|B_2,E) should be equivalent to the distribution of (P|B_2). I.e., there is no anthropic shadow effect.
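Spelled out as a worked equation, this is the step I have in mind. It is only a sketch: it assumes the chain above captures every dependence (in particular that our chance of survival depends on whether the event occurred, not on P directly), and it writes π(P) for a prior over P and Pr(·) for probabilities, which is my notation rather than the paper's.

```latex
\pi(P \mid B_2, E)
  = \frac{\Pr(E \mid B_2, P)\,\Pr(B_2 \mid P)\,\pi(P)}
         {\int \Pr(E \mid B_2, P')\,\Pr(B_2 \mid P')\,\pi(P')\,dP'}
  = \frac{\Pr(E \mid B_2)\,\Pr(B_2 \mid P)\,\pi(P)}
         {\Pr(E \mid B_2)\int \Pr(B_2 \mid P')\,\pi(P')\,dP'}
  = \pi(P \mid B_2)
```

The middle step uses Pr(E | B_2, P) = Pr(E | B_2), which is exactly what the diagram asserts, and the cancellation needs Pr(E | B_2) > 0.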

In my original EA forum question I took the messy anthropics out of it and imagined flipping a biased coin hundreds of times and painting a blue tile red with probability 1-Q (extinction) if we ever get a head. If we looked at the results of this experiment, we could estimate the bias of the coin by simply counting the number of heads. The colour of the tile is irrelevant. And we should go with the naive estimate, even though it is again true that people who see a blue tile will have fewer heads on average than is suggested by the bias of the coin.

What this observation about the tile frequencies misses is that the tile is more likely to be blue when the probability of heads is smaller (or we are more likely to exist if P is smaller), and we should take that into account too.

Overall it seems like our naive estimate of P based on the frequency of the catastrophic event in our past is totally fine when all things are considered.
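To check the coin/tile claim numerically, here is a minimal sketch. Everything in it is my own illustration: the uniform prior on a grid, the function names, and the reading of the tile rule as "the tile stays blue for certain if there are no heads, and stays blue with probability Q if there is at least one head". It computes the posterior over the coin's bias with and without conditioning on a blue tile, and separately the probability of a blue tile for a given bias.

```python
from math import comb


def posterior(n_flips, n_heads, q_survive, condition_on_blue, grid_size=99):
    """Grid posterior over the coin bias p, assuming a uniform prior."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    weights = []
    for p in grid:
        # Likelihood of the observed head count given bias p.
        like = comb(n_flips, n_heads) * p**n_heads * (1 - p) ** (n_flips - n_heads)
        if condition_on_blue:
            # Assumed tile rule: blue for sure with no heads, else blue with
            # probability Q. This factor depends on the head count only,
            # never on p directly.
            like *= 1.0 if n_heads == 0 else q_survive
        weights.append(like)
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_mean(grid, probs):
    return sum(p * w for p, w in zip(grid, probs))


def prob_blue(p, n_flips, q_survive):
    """P(blue | p) = P(no heads) + Q * P(at least one head)."""
    no_heads = (1 - p) ** n_flips
    return no_heads + q_survive * (1 - no_heads)


grid, with_tile = posterior(100, 7, 0.1, condition_on_blue=True)
_, without_tile = posterior(100, 7, 0.1, condition_on_blue=False)
print(posterior_mean(grid, with_tile), posterior_mean(grid, without_tile))
print(prob_blue(0.01, 100, 0.1), prob_blue(0.2, 100, 0.1))
```

Once the head count is fixed, the tile factor is the same constant for every value of p, so it cancels when the posterior is normalised and the two posterior means agree up to floating-point rounding. The second line shows the other half of the story: a blue tile is more probable when the bias is small.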

I’m struggling at the moment to see why the anthropic case should be different to the coin case.

Hi Toby,

Can’t we imagine 100 people doing that experiment? People will get different results: some more heads than they “should” and some fewer heads than they “should.” But the sample means will cluster around the real rate of heads. So any observer won’t know if their result has too many heads or too few. So they go with their naive estimate.

With apocalypses, you know by definition you’re one of the observers that wasn’t wiped out. So I do think this reasoning works. If I’m wrong or my explanation makes no sense, please let me know!

Thanks for your reply!

If 100 people do the experiment, the ones who end up with a blue tile will, on average, have fewer heads than they should, for exactly the same reason that most observers will live after comparatively fewer catastrophic events.

But in the coin case that still does not mean that seeing a blue tile should make you revise your naive estimate upwards. The naive estimate is still, in Bayesian terms, the correct one.

I don’t understand why the anthropic case is different.

In the tile case, the observers on average will be correct. Some will get too many heads, some too few, but you won’t know whether you should adjust your personal estimate.

In the anthropic case, the observers on average will see zero apocalypses no matter how common apocalypses are.

Imagine if in the tile case, everyone who was about to get more heads than average was killed by an assassin and the assassin told you what they were doing. Then when you did the experiment and lived, you would know your estimate was biased.

In the tile case, the observers who see a blue tile are underestimating on average. If you see a blue tile, you then know that you belong to that group, who are underestimating on average. But that still should not change your estimate. That’s weird and unintuitive, but true in the coin/tile case (unless I’ve got the maths badly wrong somewhere).
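To make the group-level claim concrete, here is a small sketch that computes the expected naive estimate (heads divided by flips) for a fixed coin bias, first over all experimenters and then over only those who end up with a blue tile. The function name, the parameter values, and the tile rule (blue for certain with no heads, otherwise blue with probability Q) are my own assumptions for illustration.

```python
from math import comb


def mean_naive_estimate(p, n_flips, q_survive, blue_only):
    """Expected heads/n_flips for a fixed bias p, over all runs or blue-tile runs."""
    num = 0.0
    den = 0.0
    for k in range(n_flips + 1):
        # Probability of exactly k heads given bias p.
        prob_k = comb(n_flips, k) * p**k * (1 - p) ** (n_flips - k)
        if blue_only:
            # Assumed tile rule: weight each head count by P(blue | k heads).
            prob_k *= 1.0 if k == 0 else q_survive
        num += (k / n_flips) * prob_k
        den += prob_k
    return num / den


p_true = 0.02
print(mean_naive_estimate(p_true, 100, 0.1, blue_only=False))
print(mean_naive_estimate(p_true, 100, 0.1, blue_only=True))
```

Averaged over everyone, the naive estimate recovers the true bias; averaged over blue-tile observers only, it comes out lower. That is the group-level underestimate. It is a statement about averaging over head counts for a fixed bias, which is a different calculation from the posterior over the bias given the head count you actually saw.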

I get that there is a difference in the anthropic case. If you kill everyone with a red tile, then you’re right, the observers on average will be biased, because it’s only the observers with a blue tile who are left, and their estimates were biased to begin with. But what I don’t understand is, why is finding out that you are alive any different to finding out that your tile is blue? Shouldn’t the update be the same?

No, because in the tile case it’s possible to observe either a blue tile or a red tile.

In the anthropic case, you either observe things (alive) or don’t observe anything (not alive).

In the first situation, the observer knows that multiple facts about the world could have been observed. Not so in the second case.

I can see that is a difference between the two cases. What I’m struggling to understand is why that leads to a different answer.

My understanding of the steps of the anthropic shadow argument (possibly flawed or incomplete) is something like this:

You are an observer → We should expect observers to underestimate the frequency of catastrophic events on average, if they use the frequency of catastrophic events in their past → You should revise your estimate of the frequency of catastrophic events upwards

But in the coin/tile case you could make an exactly analogous argument:

You see a blue tile → We should expect people who see a blue tile to underestimate the frequency of heads on average, if they use the frequency of heads in their past → You should revise your estimate of the frequency of heads upwards.

But in the coin/tile case, this argument is wrong, even though it appears intuitively plausible. If you do the full Bayesian analysis, that argument leads you to the wrong answer. Why should we trust the argument of identical structure in the anthropic case?