[Stats4EA] Expectations are not Outcomes
This is the first in what might become a bunch of posts picking out issues from statistics and probability of relevance to EA. The format will be informal and fairly bite-size. None of this will be original, hopefully.
Expectations are not outcomes
Here we attempt to trim back the intuition that an expected value can be safely thought of as a representative value of the random variable.
Situation 1
A Rademacher random variable X takes the value 1 with probability 1⁄2 and otherwise −1. Its expectation is zero. We will almost surely never see any value other than −1 or 1.
This means that the expected value might not even be a number the distribution could produce. We might not even be able to get arbitrarily close to it.
Imagine walking up to a table in a casino and betting that the next roll of a fair six-sided die will be 7⁄2, its expected value.
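To make this concrete, here is a minimal sketch that computes both expectations exactly and checks how far each is from anything that can actually happen. The helper name `expectation` and the dictionary representation are just illustrative choices, and the die is assumed to be fair and six-sided.

```python
from fractions import Fraction

def expectation(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())

rademacher = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
die = {face: Fraction(1, 6) for face in range(1, 7)}

for name, dist in [("Rademacher", rademacher), ("die", die)]:
    mean = expectation(dist)
    # Probability that a single draw lands exactly on the expectation.
    p_hit = dist.get(mean, Fraction(0))
    # Distance from the expectation to the nearest value that can occur.
    gap = min(abs(value - mean) for value in dist)
    print(f"{name}: E[X] = {mean}, P(X = E[X]) = {p_hit}, nearest outcome is {gap} away")
```

In both cases the probability of observing the expectation is exactly zero, and the nearest achievable outcome sits a fixed distance away.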
Situation 2
Researchers create a natural language simulation model. Upon receiving a piece of text as stimulus it outputs a random short story. What is the expectation of the story?
Let’s think about the first word. There will be some implied probability distribution over a dictionary. Its expectation is some fractional combination of every word in the dictionary. Whatever that means, and whatever it is useful for, it is not the start of a legible story—and should not be used as such.
What is the expected length of the story? What would a solution to that problem mean? Could one, for example, print the expected story?
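One way to see what an "expected first word" could even mean is to encode each word as a one-hot vector and average. The sketch below does that for a tiny made-up vocabulary; the words and probabilities are invented for illustration and do not come from any real model.

```python
# Hypothetical next-word distribution over a tiny made-up vocabulary.
vocab = ["once", "the", "a", "she"]
probs = [0.4, 0.3, 0.2, 0.1]

def one_hot(index, size):
    """Vector with a 1.0 in position `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# E[one_hot(W)] = sum over words w of P(W = w) * one_hot(w)
expected_vector = [0.0] * len(vocab)
for index, p in enumerate(probs):
    for i, component in enumerate(one_hot(index, len(vocab))):
        expected_vector[i] += p * component

# Does the expectation coincide with the encoding of any actual word?
is_a_word = any(
    expected_vector == one_hot(i, len(vocab)) for i in range(len(vocab))
)

print(expected_vector)
print(is_a_word)
```

Under this encoding the expectation is just the probability vector itself: a blend of every word, not a word. Turning it back into text requires a different operation, such as sampling or taking the most likely word, and neither of those is "the expectation".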
Situation 3
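Some distributions with very fat tails have no expectation at all. For instance, the expectation of the Cauchy distribution is undefined, so the average of a sample does not settle down to any particular value however many draws we take.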
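A rough simulation shows the behaviour. The sketch below draws from a standard Cauchy distribution using the inverse CDF and prints the sample mean at a few sample sizes; the seed and the sample sizes are arbitrary choices made only for repeatability. The mean of n independent standard Cauchy draws is itself standard Cauchy, so more data does not tighten it, whereas the sample median does home in on the centre.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(0)  # arbitrary seed, only for repeatability
draws = [cauchy_draw(rng) for _ in range(100_000)]

running_sum = 0.0
for n, x in enumerate(draws, start=1):
    running_sum += x
    if n in (100, 1_000, 10_000, 100_000):
        print(f"n = {n:>7}: sample mean = {running_sum / n: .3f}")

# The median, unlike the mean, is a well-behaved summary here.
print(f"sample median = {statistics.median(draws): .3f}")
```

Rerunning with different seeds is the more telling experiment: the printed means jump around from run to run, while the median stays close to zero.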
Implication
It is tempting to let an expectation stand in freely for the random variable it summarises. Suppose we used the following procedure in a blanket fashion:
1. We are faced with a decision depending on an uncertain outcome.
2. We take the expected value of the outcome.
3. We use the expectation as a scenario to plan around.
Step three is unsafe in principle—even if sometimes not in practice.
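A toy calculation illustrates how step three can go wrong. Everything in this sketch is hypothetical: the need is assumed to be either 0 or 10 units with equal probability, and the cost function (5 per unit short, 1 per unit left over) is invented purely to make the point. It compares planning for the expected need with choosing the plan that minimises expected cost.

```python
from fractions import Fraction

# Hypothetical scenario: the need is 0 or 10 units, each with probability 1/2.
need_dist = {0: Fraction(1, 2), 10: Fraction(1, 2)}

def cost(plan, need):
    """Made-up cost: 5 per unit short, 1 per unit left over."""
    shortage = max(need - plan, 0)
    surplus = max(plan - need, 0)
    return 5 * shortage + 1 * surplus

def expected_cost(plan):
    """Average the cost of a plan over the distribution of the need."""
    return sum(prob * cost(plan, need) for need, prob in need_dist.items())

# Steps two and three: take the expectation, then plan around it.
expected_need = sum(prob * need for need, prob in need_dist.items())

# Alternative: pick the whole-number plan with the lowest expected cost.
best_plan = min(range(0, 11), key=expected_cost)

print(f"plan for E[need] = {expected_need}: expected cost {expected_cost(expected_need)}")
print(f"best plan = {best_plan}: expected cost {expected_cost(best_plan)}")
```

Here the plan built around the expectation prepares for a need of 5, a scenario that never occurs, and its expected cost is three times that of the best plan. With a different cost function the gap could shrink or vanish, which is the sense in which the procedure is unsafe in principle even if sometimes harmless in practice.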
If there is a next time (the length of this series is currently fractional) I hope to touch on some scenarios less easily dismissed as the concerns of a pedant.