[Stats4EA] Expectations are not Outcomes

This is the first in what might become a bunch of posts picking out issues from statistics and probability of relevance to EA. The format will be informal and fairly bite-size. None of this will be original, hopefully.


Expectations are not outcomes


Here we attempt to trim back the intuition that an expected value can be safely thought of as a representative value of the random variable.


Situation 1


A Rademacher random variable X takes the value 1 with probability 1/2 and otherwise −1. Its expectation is zero. We will almost surely never see any value other than −1 or 1.


This means that the expected value might not even be a number the distribution could produce. We might not even be able to get arbitrarily close to it.

Imagine walking up to a table in a casino and betting that the next roll of a die will be 7/2, its expected value.
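A minimal sketch of the point, assuming a fair six-sided die and the Rademacher variable described above. The helper name `expectation` and the seed are illustrative choices, not anything canonical. It computes each expectation exactly and then measures how close any single outcome can get to it.

```python
import random
from fractions import Fraction


def expectation(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())


# Rademacher: +1 or -1, each with probability 1/2.
rademacher = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
# Fair six-sided die (an assumption; the text just says "a die").
die = {face: Fraction(1, 6) for face in range(1, 7)}

rademacher_mean = expectation(rademacher)  # 0
die_mean = expectation(die)                # 7/2

# Distance from the expectation to the nearest value that can actually occur.
rademacher_gap = min(abs(v - rademacher_mean) for v in rademacher)  # 1
die_gap = min(abs(v - die_mean) for v in die)                       # 1/2

# Simulate: count how often a Rademacher draw equals its own expectation.
rng = random.Random(0)
draws = [rng.choice([-1, 1]) for _ in range(10_000)]
hits = sum(1 for d in draws if d == rademacher_mean)

print("Rademacher mean:", rademacher_mean, "nearest outcome is", rademacher_gap, "away")
print("Die mean:", die_mean, "nearest outcome is", die_gap, "away")
print("Draws equal to the mean:", hits, "out of", len(draws))
```

No draw ever lands on the expectation, because the expectation is not in the support of either distribution.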


Situation 2


Researchers create a natural language simulation model. Upon receiving a piece of text as stimulus it outputs a random short story. What is the expectation of the story?


Let’s think about the first word. There will be some implied probability distribution over a dictionary. Its expectation is some fractional combination of every word in the dictionary. Whatever that means, and whatever it is useful for, it is not the start of a legible story—and should not be used as such.
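To make "fractional combination of every word" concrete, here is a toy sketch. Everything in it is assumed for illustration: the four-word vocabulary, the probabilities, and the choice to represent words as one-hot vectors (one common way to give words a numeric form so that an expectation can be taken at all).

```python
# Hypothetical four-word vocabulary and first-word probabilities.
vocab = ["once", "the", "a", "it"]
probs = [0.4, 0.3, 0.2, 0.1]


def one_hot(index, size):
    """Vector with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Expectation of the first word: probability-weighted sum of one-hot vectors.
expected_vector = [0.0] * len(vocab)
for idx, p in enumerate(probs):
    word_vector = one_hot(idx, len(vocab))
    expected_vector = [e + p * w for e, w in zip(expected_vector, word_vector)]

# Does the expectation coincide with any actual word?
is_a_word = any(
    expected_vector == one_hot(i, len(vocab)) for i in range(len(vocab))
)

print("Expected first word:", dict(zip(vocab, expected_vector)))
print("Is it a word in the vocabulary?", is_a_word)
```

Under this encoding the expectation is just the probability vector itself: a blend of every word, which is not something one can print as the start of a story.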


What is the expected length of the story? What would a solution to that problem mean? Could one, for example, print the expected story?


Situation 3


Some distributions have very fat tails. For instance, the tails of the Cauchy distribution are heavy enough that its expectation is undefined.
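A rough simulation sketch of what an undefined expectation looks like in practice. The sampler uses the standard inverse-CDF formula for the standard Cauchy distribution; the seed, sample size, and checkpoints are arbitrary choices. The running mean is reported at a few checkpoints alongside the sample median, which is a well-behaved summary here.

```python
import math
import random
import statistics


def cauchy_draw(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))


rng = random.Random(0)
n = 100_000
draws = [cauchy_draw(rng) for _ in range(n)]

# Record the running mean at a few sample sizes.
checkpoints = {100, 1_000, 10_000, 100_000}
running_means = {}
total = 0.0
for i, x in enumerate(draws, start=1):
    total += x
    if i in checkpoints:
        running_means[i] = total / i

sample_median = statistics.median(draws)
largest = max(abs(x) for x in draws)

for size in sorted(running_means):
    print(f"mean of first {size:>7} draws: {running_means[size]: .4f}")
print(f"sample median: {sample_median: .4f}")
print(f"largest |draw|: {largest:.1f}")
```

A handful of enormous draws dominate the sum, so the running mean has nothing to settle towards as the sample grows, while the median stays close to zero.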


Implication


It is tempting to freely substitute an expectation in as a representative of a random variable. Suppose we used the following procedure in a blanket fashion:

  1. We are faced with a decision depending on an uncertain outcome.

  2. We take the expected value of the outcome.

  3. We use the expectation as a scenario to plan around.

Step three is unsafe in principle—even if sometimes not in practice.
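One way to see the problem with step three, reusing the Rademacher variable from Situation 1. The squared outcome is a hypothetical stand-in for whatever quantity the plan depends on. The sketch compares evaluating that quantity at the expectation with taking the expectation of the quantity.

```python
from fractions import Fraction

# Rademacher: +1 or -1, each with probability 1/2.
rademacher = {-1: Fraction(1, 2), 1: Fraction(1, 2)}


def expectation(dist, f=lambda x: x):
    """Expected value of f(X) for a finite distribution {value: probability}."""
    return sum(f(value) * prob for value, prob in dist.items())


def squared(x):
    """Hypothetical quantity of interest that depends on the outcome."""
    return x * x


# Steps 2 and 3: collapse X to its expectation, then plan around that scenario.
plan_value = squared(expectation(rademacher))    # squared(0) = 0

# What actually happens on average across real outcomes.
actual_value = expectation(rademacher, squared)  # always 1

print("Value in the 'expected scenario':", plan_value)
print("Expected value across real outcomes:", actual_value)
```

The plan is built around a scenario (X = 0) that never occurs, and it misjudges the quantity of interest in every outcome that does.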

If there is a next time (the length of this series is currently fractional) I hope to touch on some scenarios less easily dismissed as the concerns of a pedant.