# [Stats4EA] Expectations are not Outcomes

This is the first in what might become a bunch of posts picking out issues from statistics and probability of relevance to EA. The format will be informal and fairly bite-size. None of this will be original, hopefully.

## Expectations are not outcomes

Here we attempt to trim back the intuition that an expected value can be safely thought of as a representative value of the random variable.

## Situation 1

A Rademacher random variable X takes the value 1 with probability 1/2 and otherwise −1. Its expectation is zero. We will almost surely never see any value other than −1 or 1.

This means that the expected value might not even be a number the distribution could produce. We might not even be able to get arbitrarily close to it.

Imagine walking up to a table in a casino and betting that the next roll of a die will be 3.5.
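A minimal simulation makes the point concrete: the sample mean homes in on the expectation, yet no individual draw ever equals it.

```python
import random

random.seed(0)

# Simulate a Rademacher variable: +1 or -1, each with probability 1/2.
draws = [random.choice([-1, 1]) for _ in range(100_000)]

mean = sum(draws) / len(draws)
print(f"sample mean ~ {mean:.4f}")  # close to the expectation, 0
print(0 in draws)                   # False: no draw is ever 0

# Same story for a fair die: the expectation is 3.5,
# a value no roll can ever show.
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))      # close to 3.5
```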

## Situation 2

Researchers create a natural language simulation model. Upon receiving a piece of text as stimulus it outputs a random short story. What is the expectation of the story?

Let’s think about the first word. There will be some implied probability distribution over a dictionary. Its expectation is some fractional combination of every word in the dictionary. Whatever that means, and whatever it is useful for, it is not the start of a legible story, and should not be used as such.

What is the expected length of the story? What would a solution to that problem mean? Could one, for example, print the expected story?
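A toy sketch of the first-word problem, using a made-up three-word vocabulary with hypothetical probabilities: encoding each word as a one-hot vector, the expectation of those vectors is just the probability vector itself, a fractional blend of every word rather than any word.

```python
import random

random.seed(0)

# Hypothetical distribution over the first word of a story.
words = ["once", "the", "in"]
probs = [0.5, 0.3, 0.2]

# One-hot encode each word, then take the probability-weighted average.
one_hot = [[1 if i == j else 0 for j in range(len(words))]
           for i in range(len(words))]
expectation = [sum(p * vec[j] for p, vec in zip(probs, one_hot))
               for j in range(len(words))]
print(expectation)  # [0.5, 0.3, 0.2] -- not a word at all

# Representative *outcomes* look quite different: a sample, or the mode.
sample = random.choices(words, weights=probs)[0]
mode = words[probs.index(max(probs))]
print(sample, mode)
```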

## Situation 3

Distributions can have very fat tails. The Cauchy distribution, for instance, has no defined expectation at all.
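A quick way to see this, as a sketch using only the standard library: draw standard Cauchy samples via the inverse CDF, tan(π(U − 1/2)), and watch the running sample means. For a distribution with a finite mean these would settle down; for the Cauchy they keep lurching whenever an enormous draw arrives.

```python
import math
import random

random.seed(42)

def cauchy_draw():
    # Standard Cauchy via the inverse CDF: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (random.random() - 0.5))

draws = [cauchy_draw() for _ in range(100_000)]

# Record the running sample mean at a few checkpoints.
running = []
total = 0.0
for i, x in enumerate(draws, 1):
    total += x
    if i % 20_000 == 0:
        running.append(total / i)

print(running)  # no convergence: the law of large numbers does not apply
```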

## Implication

It is tempting to freely substitute an expectation in as a representative of a random variable. Suppose we used the following procedure in a blanket fashion:

1. We are faced with a decision depending on an uncertain outcome.

2. We take the expected value of the outcome.

3. We use the expectation as a scenario to plan around.

Step three is unsafe in principle, even if it sometimes works in practice.
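A hypothetical planning problem shows how step three can fail outright. Suppose demand for some good is either 0 or 1000 units, each with probability 1/2 (made-up numbers). Planning around the expected demand of 500 guarantees being wrong in every possible world:

```python
# Hypothetical two-point demand distribution.
outcomes = {0: 0.5, 1000: 0.5}

expected_demand = sum(d * p for d, p in outcomes.items())
print(expected_demand)  # 500.0 -- a quantity demand can never take

def error(stock, demand):
    """Absolute mismatch between what we stocked and what was needed."""
    return abs(stock - demand)

# Stocking for the "expected scenario" misses by 500 in both worlds:
# half the time we hold 500 units of waste, half the time we fall
# 500 units short.
for demand in outcomes:
    print(demand, error(int(expected_demand), demand))
```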

If there is a next time (the length of this series is currently fractional) I hope to touch on some scenarios less easily dismissed as the concerns of a pedant.

• I’ve found Christian Tarsney’s “Exceeding Expectations” insightful when it comes to recognizing and maybe coping with the limits of expected value.

> The principle that rational agents should maximize expected utility or choiceworthiness is intuitively plausible in many ordinary cases of decision-making under uncertainty. But it is less plausible in cases of extreme, low-probability risk (like Pascal’s Mugging), and intolerably paradoxical in cases like the St. Petersburg and Pasadena games. In this paper I show that, under certain conditions, stochastic dominance reasoning can capture most of the plausible implications of expectational reasoning while avoiding most of its pitfalls. Specifically, given sufficient background uncertainty about the choiceworthiness of one’s options, many expectation-maximizing gambles that do not stochastically dominate their alternatives “in a vacuum” become stochastically dominant in virtue of that background uncertainty. But, even under these conditions, stochastic dominance will not require agents to accept options whose expectational superiority depends on sufficiently small probabilities of extreme payoffs. The sort of background uncertainty on which these results depend looks unavoidable for any agent who measures the choiceworthiness of her options in part by the total amount of value in the resulting world. At least for such agents, then, stochastic dominance offers a plausible general principle of choice under uncertainty that can explain more of the apparent rational constraints on such choices than has previously been recognized.

See also the post/sequence by Daniel Kokotajlo, “Tiny Probabilities of Vast Utilities”. I’m linking to the post that was most valuable to me, but by default it might make sense to start with the first one in the sequence. ^^

• Thanks, that last link was one I’d come across and liked when looking for previous coverage. My sole previous blog post was about Pascal’s Wager. I’d found, though, when speaking about it that I was assuming too much for some of the audience I wanted to bring along, notwithstanding my sloppy writing :D So, I’m going to attempt to stay focused and incremental.

• Thanks for writing this! It’s always useful to get reminders for the sort of mistakes we can fail to notice even when they’re significant.

I also think it would be a lot more helpful to walk through how this mistake could happen in some real scenarios in the context of EA (even though these scenarios would naturally be less clear-cut and more complex).

Lastly, it might be worth noting the many other tools we have to represent random variables. Some options off the top of my head:

* Expectation & variance: Sometimes useful for normal distributions and other intuitive distributions (eg QALYs per $ for many interventions at scale).

* Confidence intervals: Useful for many cases where the result is likely to be in a specific range (eg effect size for a specific treatment).

* Probabilities for specific outcomes or events: Sometimes useful for distributions with important anomalies (eg impact of a new organization), or when looking for specific combinations of multiple distributions (eg the probability that AGI is coming soon and also that current alignment research is useful).

* Full model of the distribution: Sometimes useful for simple / common distributions (all the examples that come to mind aren’t in the context of EA, oh well).

One small note: The examples are there to make the category clearer. These aren’t all cases where expected value is wrong / inappropriate to use. Specifically, for some of them, I think using expected value works great.

• > I also think it would be a lot more helpful to walk through how this mistake could happen in some real scenarios in the context of EA

Hopefully, we’ll get there! It’ll be mostly Bayesian though :)

• Thanks for writing this. I hadn’t thought about this explicitly before and think it’s useful. The bite-sized format is great. A series of posts would be great too.