Comparing Utilities

Link post

Abram Demski has written a great summary of different approaches to aggregating utilities. Overall, it is by far the best short summary of a bunch of stuff in bargaining theory and public choice theory that I’ve seen. Some excerpts:

(This is a ba­sic point about util­ity the­ory which many will already be fa­mil­iar with. I draw some non-ob­vi­ous con­clu­sions which may be of in­ter­est to you even if you think you know this from the ti­tle—but the main point is to com­mu­ni­cate the ba­sics. I’m post­ing it to the al­ign­ment fo­rum be­cause I’ve heard mi­s­un­der­stand­ings of this from some in the AI al­ign­ment re­search com­mu­nity.)

I will first give the ba­sic ar­gu­ment that the util­ity quan­tities of differ­ent agents aren’t di­rectly com­pa­rable, and a few im­por­tant con­se­quences of this. I’ll then spend the rest of the post dis­cussing what to do when you need to com­pare util­ity func­tions.

Utilities aren’t com­pa­rable.

Utility isn’t an or­di­nary quan­tity. A util­ity func­tion is a de­vice for ex­press­ing the prefer­ences of an agent.

Sup­pose we have a no­tion of out­come.* We could try to rep­re­sent the agent’s prefer­ences be­tween out­comes as an or­der­ing re­la­tion: if we have out­comes A, B, and C, then one pos­si­ble prefer­ence would be A<B<C.

However, a mere ordering does not tell us how the agent would decide between gambles, i.e., situations giving A, B, and C with some probability.

With just three out­comes, there is only one thing we need to know: is B closer to A or C, and by how much?

We want to construct a utility function U() which represents the preferences. Let’s say we set U(A)=0 and U(C)=1. If the agent is indifferent between B and a gamble G which gives A or C with equal probability, then we can represent B=G as U(B)=1/2. If not, we would look for a different gamble which does equal B, and then set B’s utility to the expected value of that gamble. By assigning real-numbered values to each outcome, we can fully represent an agent’s preferences over gambles. (Assuming the VNM axioms hold, that is.)
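As a minimal sketch of this construction, assume (hypothetically) that the agent is indifferent between B and an even gamble between A and C. The dictionary layout and the names `expected_utility`, `g1`, and `g2` are illustrative choices, not part of the original post.

```python
from fractions import Fraction

# Utilities fixed by the arbitrary choices U(A)=0 and U(C)=1.
U = {"A": Fraction(0), "C": Fraction(1)}

def expected_utility(gamble, utility):
    """Expected utility of a gamble given as {outcome: probability}."""
    assert sum(gamble.values()) == 1, "probabilities must sum to 1"
    return sum(p * utility[o] for o, p in gamble.items())

# Assumed indifference point: the agent values B exactly as much as a
# gamble giving C with probability 1/2 and A otherwise.
indifference_gamble = {"A": Fraction(1, 2), "C": Fraction(1, 2)}
U["B"] = expected_utility(indifference_gamble, U)

# With U(B) pinned down, any two gambles over A, B, C can be compared.
g1 = {"B": Fraction(1)}                          # B for certain
g2 = {"A": Fraction(2, 5), "C": Fraction(3, 5)}  # C with probability 3/5
prefers_g2 = expected_utility(g2, U) > expected_utility(g1, U)
```

A different indifference point would give a different U(B), and with it a different ranking of gambles, even though the ordering A<B<C stays the same.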

But the ini­tial choices U(A)=0 and U(C)=1 were ar­bi­trary! We could have cho­sen any num­bers so long as U(A)<U(C), re­flect­ing the prefer­ence A<C. In gen­eral, a valid rep­re­sen­ta­tion of our prefer­ences U() can be mod­ified into an equally valid U’() by adding/​sub­tract­ing ar­bi­trary num­bers, or mul­ti­ply­ing/​di­vid­ing by pos­i­tive num­bers.
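To make the equivalence concrete, here is a small sketch checking that a positive rescaling plus a shift leaves the ranking of gambles unchanged, while a negative multiplier reverses it. The specific utilities, gambles, and the constants `scale` and `shift` are made up for illustration.

```python
def expected_utility(gamble, utility):
    """Expected utility of a gamble given as {outcome: probability}."""
    return sum(p * utility[o] for o, p in gamble.items())

def rank_gambles(gambles, utility):
    """Return gamble names sorted from least to most preferred."""
    return sorted(gambles, key=lambda name: expected_utility(gambles[name], utility))

U = {"A": 0.0, "B": 0.5, "C": 1.0}

# U' = scale * U + shift with scale > 0 (both constants are arbitrary).
scale, shift = 7.0, -3.0
U_prime = {o: scale * u + shift for o, u in U.items()}

gambles = {
    "sure_B": {"B": 1.0},
    "mostly_C": {"A": 0.25, "C": 0.75},
    "mostly_A": {"A": 0.9, "C": 0.1},
}

# Same preferences over gambles under U and U'.
same_ranking = rank_gambles(gambles, U) == rank_gambles(gambles, U_prime)

# A negative multiplier is not allowed: it reverses every preference.
U_flipped = {o: -u for o, u in U.items()}
reversed_ranking = rank_gambles(gambles, U_flipped) == rank_gambles(gambles, U)[::-1]
```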


Var­i­ance Nor­mal­iza­tion: Not Too Ex­ploitable?

We could set the constants any way we want: totally subjective estimates of the worth of a person, random lots, etc. But we do typically want to represent some notion of fairness. We said in the beginning that the problem was that a utility function U has many equivalent representations, obtained by multiplying U by a positive constant and adding an arbitrary constant. We can address this as a problem of normalization: we want to take a U and put it into a canonical form, getting rid of the choice between equivalent representations.

One way of thinking about this is strategy-proofness. A utilitarian collective should not be vulnerable to members strategically claiming that their preferences are stronger (a larger multiplicative constant), or that they should get more because they’re worse off than everyone (a smaller additive constant; although, remember that we haven’t talked about any setup which actually cares about that, yet).

Warm-Up: Range Normalization

Un­for­tu­nately, some ob­vi­ous ways to nor­mal­ize util­ity func­tions are not go­ing to be strat­egy-proof.

One of the simplest normalization techniques is to squish everything into a specified range, such as [0,1].

This is analo­gous to range vot­ing: ev­ery­one re­ports their prefer­ences for differ­ent out­comes on a fixed scale, and these all get summed to­gether in or­der to make de­ci­sions.
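A minimal sketch of range normalization followed by summation, assuming a finite set of outcomes and reports given as dictionaries. The agents `alice` and `bob` and their numbers are hypothetical.

```python
def range_normalize(utility):
    """Rescale a utility function {outcome: value} so min -> 0 and max -> 1."""
    lo, hi = min(utility.values()), max(utility.values())
    if hi == lo:
        # An indifferent agent has no scale to normalize; report all zeros.
        return {o: 0.0 for o in utility}
    return {o: (u - lo) / (hi - lo) for o, u in utility.items()}

def collective_choice(utilities):
    """Sum the range-normalized utilities and return (top outcome, totals)."""
    normalized = [range_normalize(u) for u in utilities]
    totals = {o: sum(n[o] for n in normalized) for o in normalized[0]}
    return max(totals, key=totals.get), totals

# Two agents reporting on wildly different scales.
alice = {"x": -100.0, "y": 0.0, "z": 100.0}
bob = {"x": 3.0, "y": 2.9, "z": 1.0}
winner, totals = collective_choice([alice, bob])
```

Here the compromise outcome `y` wins, because after normalization it is Alice’s middle option and nearly Bob’s favorite. The raw scale of each report no longer matters.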

If you’re an agent in a collective which uses range normalization, then you may want to strategically mis-report your preferences. For example, suppose the agent has a big hump around outcomes they like, and a small hump on a secondary “just OK” outcome. The agent might want to get rid of the second hump, forcing the group outcome into the more favored region.

I be­lieve that in the ex­treme, the op­ti­mal strat­egy for range vot­ing is to choose some util­ity thresh­old. Any­thing be­low that thresh­old goes to zero, feign­ing max­i­mal dis­ap­proval of the out­come. Any­thing above the thresh­old goes to one, feign­ing max­i­mal ap­proval. In other words, un­der strate­gic vot­ing, range vot­ing be­comes ap­proval vot­ing (range vot­ing where the only op­tions are zero and one).
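The threshold strategy can be sketched as follows. This is an illustration under assumptions: reports already lie in [0,1], the other voters report honestly, and the threshold of 0.9 is picked by hand rather than optimized. All names and numbers are hypothetical.

```python
def approval_style_report(true_utility, threshold):
    """Exaggerate a [0,1] utility report: 1 above the threshold, 0 otherwise."""
    return {o: 1.0 if u > threshold else 0.0 for o, u in true_utility.items()}

def winner(reports):
    """Outcome with the highest summed report."""
    totals = {o: sum(r[o] for r in reports) for o in reports[0]}
    return max(totals, key=totals.get)

# My true (range-normalized) utilities: "b" is the "just OK" outcome.
me = {"a": 1.0, "b": 0.8, "c": 0.0}
others = [
    {"a": 0.3, "b": 1.0, "c": 0.0},
    {"a": 0.4, "b": 0.0, "c": 1.0},
]

honest_winner = winner([me] + others)
strategic_winner = winner([approval_style_report(me, threshold=0.9)] + others)

# Gain measured in my TRUE utility, not the exaggerated report.
gain = me[strategic_winner] - me[honest_winner]
```

Reporting honestly elects `b`; feigning maximal disapproval of `b` elects `a`, which the strategic voter truly prefers. Which threshold is best depends on what the voter expects the others to report.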

If it’s not pos­si­ble to mis-re­port your prefer­ences, then the in­cen­tive be­comes to self-mod­ify to liter­ally have these ex­treme prefer­ences. This could per­haps have a real-life analogue in poli­ti­cal out­rage and black-and-white think­ing. If we use this nor­mal­iza­tion scheme, that’s the clos­est you can get to be­ing a util­ity mon­ster.

Var­i­ance Normalization

We’d like to avoid any in­cen­tive to mis­rep­re­sent/​mod­ify your util­ity func­tion. Is there a way to achieve that?

Owen Cotton-Barratt discusses different normalization techniques in illuminating detail, and argues for variance normalization: divide utility functions by their standard deviation, making the variance one. (Geometric reasons for normalizing variance to aggregate preferences, O Cotton-Barratt, 2013.) Variance normalization is strategy-proof under the assumption that everyone participating in an election shares beliefs about how probable the different outcomes are! (Note that variance of utility is only well-defined under some assumption about probability of outcome.) That’s pretty good. It’s probably the best we can get, in terms of strategy-proofness of voting. Will MacAskill also argues for variance normalization in the context of normative uncertainty (Normative Uncertainty, Will MacAskill, 2014).

Intuitively, variance normalization directly addresses the issue we encountered with range normalization: an individual attempts to make their preferences “loud” by extremizing everything to 0 or 1. This increases variance, so it is directly punished by variance normalization.
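A rough sketch of variance normalization under an assumed shared belief distribution `prob` over outcomes. The function subtracts the mean and divides by the standard deviation, which is the scaling that makes the variance exactly one. The uniform beliefs and the two reports are hypothetical.

```python
import math

def variance_normalize(utility, prob):
    """Shift to mean 0 and scale to variance 1 under the shared beliefs `prob`.

    Returns (normalized utility, variance of the original report).
    """
    mean = sum(prob[o] * utility[o] for o in utility)
    var = sum(prob[o] * (utility[o] - mean) ** 2 for o in utility)
    if var == 0:
        raise ValueError("an indifferent agent has no variance to normalize")
    std = math.sqrt(var)
    return {o: (utility[o] - mean) / std for o in utility}, var

prob = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}  # assumed shared beliefs
honest = {"a": 1.0, "b": 0.8, "c": 0.0}
extreme = {"a": 1.0, "b": 0.0, "c": 0.0}     # same agent, extremized report

honest_norm, honest_var = variance_normalize(honest, prob)
extreme_norm, extreme_var = variance_normalize(extreme, prob)

# The extremized report has higher raw variance, so it is divided by a
# larger number; both reports end up with variance one.
```

The extremized report still changes which outcomes the agent pushes for, but it no longer buys extra overall weight: the larger spread is scaled away.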

How­ever, Jame­son Quinn, LessWrong’s res­i­dent vot­ing the­ory ex­pert, has warned me rather strongly about var­i­ance nor­mal­iza­tion.

  1. The as­sump­tion of shared be­liefs about elec­tion out­comes is far from true in prac­tice. Jame­son Quinn tells me that, in fact, the strate­gic vot­ing in­cen­tivized by quadratic vot­ing is par­tic­u­larly bad amongst nor­mal­iza­tion tech­niques.

  2. Strat­egy-proof­ness isn’t, af­ter all, the fi­nal ar­biter of the qual­ity of a vot­ing method. The fi­nal ar­biter should be some­thing like the util­i­tar­ian qual­ity of an elec­tion’s out­come. This ques­tion gets a bit weird and re­cur­sive in the cur­rent con­text, where I’m us­ing elec­tions as an anal­ogy to ask how we should define util­i­tar­ian out­comes. But the point still, to some ex­tent, stands.

I didn’t un­der­stand the full jus­tifi­ca­tion be­hind his point, but I came away think­ing that range nor­mal­iza­tion was prob­a­bly bet­ter in prac­tice. After all, it re­duces to ap­proval vot­ing, which is ac­tu­ally a pretty good form of vot­ing. But if you want to do the best we can with the state of vot­ing the­ory, Jame­son Quinn sug­gested 3-2-1 vot­ing. (I don’t think 3-2-1 vot­ing gives us any nice the­ory about how to com­bine util­ity func­tions, though, so it isn’t so use­ful for our pur­poses.)

Open Ques­tion: Is there a var­i­ant of var­i­ance nor­mal­iza­tion which takes differ­ing be­liefs into ac­count, to achieve strat­egy-proof­ness (IE hon­est re­port­ing of util­ity)?

Any­way, so much for nor­mal­iza­tion tech­niques. Th­ese tech­niques ig­nore the broader con­text. They at­tempt to be fair and even-handed in the way we choose the mul­ti­plica­tive and ad­di­tive con­stants. But we could also ex­plic­itly try to be fair and even-handed in the way we choose be­tween Pareto-op­ti­mal out­comes, as with this next tech­nique.

Nash Bar­gain­ing Solution

It’s im­por­tant to re­mem­ber that the Nash bar­gain­ing solu­tion is a solu­tion to the Nash bar­gain­ing prob­lem, which isn’t quite our prob­lem here. But I’m go­ing to gloss over that. Just imag­ine that we’re set­ting the so­cial choice func­tion through a mas­sive ne­go­ti­a­tion, so that we can ap­ply bar­gain­ing the­ory.

Nash offers a very simple solution, which I’ll get to in a minute. But first, a few words on how this solution is derived. Nash provides two separate justifications for his solution. The first is a game-theoretic derivation of the solution as an especially robust Nash equilibrium. I won’t detail that here (I quite recommend his original paper, The Bargaining Problem, 1950), but just keep in mind that there is at least some reason to expect selfishly rational agents to hit upon this particular solution. The second, unrelated justification is an axiomatic one:

  1. In­var­i­ance to equiv­a­lent util­ity func­tions. This is the same mo­ti­va­tion I gave when dis­cussing nor­mal­iza­tion.

  2. Pareto op­ti­mal­ity. We’ve already dis­cussed this as well.

  3. In­de­pen­dence of Ir­rele­vant Alter­na­tives (IIA). This says that we shouldn’t change the out­come of bar­gain­ing by re­mov­ing op­tions which won’t ul­ti­mately get cho­sen any­way. This isn’t even tech­ni­cally one of the VNM ax­ioms, but it es­sen­tially is—the VNM ax­ioms are posed for bi­nary prefer­ences (a > b). IIA is the as­sump­tion we need to break down multi-choice prefer­ences to bi­nary choices. We can jus­tify IIA with a kind of money pump.

  4. Sym­me­try. This says that the out­come doesn’t de­pend on the or­der of the bar­gain­ers; we don’t pre­fer Player 1 in case of a tie, or any­thing like that.
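In its standard form, the Nash bargaining solution picks the feasible outcome which maximizes the product of each player’s gain over the disagreement point. The sketch below is a simplification: it assumes two players and a finite list of outcomes (the actual bargaining problem uses a convex feasible set), and all payoffs are invented. It then checks the first axiom by rescaling one player’s utility function.

```python
def nash_bargaining(outcomes, disagreement):
    """Pick the outcome maximizing the product of gains over disagreement.

    outcomes: {name: (u1, u2)}; disagreement: (d1, d2).
    Only outcomes leaving both players at least as well off are considered.
    """
    d1, d2 = disagreement
    feasible = {
        n: (u1, u2) for n, (u1, u2) in outcomes.items() if u1 >= d1 and u2 >= d2
    }
    return max(feasible, key=lambda n: (feasible[n][0] - d1) * (feasible[n][1] - d2))

outcomes = {"w": (1.0, 9.0), "x": (4.0, 6.0), "y": (6.0, 3.0), "z": (9.0, 0.5)}
disagreement = (0.0, 0.0)
choice = nash_bargaining(outcomes, disagreement)

def rescale(u2):
    """An equivalent representation of player 2's utility (positive affine map)."""
    return 100.0 * u2 + 7.0

# Axiom 1: the chosen outcome does not change under the equivalent representation.
outcomes_rescaled = {n: (u1, rescale(u2)) for n, (u1, u2) in outcomes.items()}
disagreement_rescaled = (disagreement[0], rescale(disagreement[1]))
choice_rescaled = nash_bargaining(outcomes_rescaled, disagreement_rescaled)
```

Subtracting the disagreement point cancels the additive constant, and a positive multiplier scales every product by the same factor, so the maximizer is unchanged.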


Altru­is­tic agents.

Another puz­zling case, which I think needs to be han­dled care­fully, is ac­count­ing for the prefer­ences of al­tru­is­tic agents.

Let’s pro­ceed with a sim­plis­tic model where agents have “per­sonal prefer­ences” (prefer­ences which just have to do with them­selves, in some sense) and “cofrences” (co-prefer­ences; prefer­ences hav­ing to do with other agents).

Here’s an agent named Sandy:

[Table: Sandy’s Personal Preferences and Cofrences]

The cofrences represent coefficients on other agents’ utility functions. Sandy’s preferences are supposed to be understood as a utility function representing Sandy’s personal preferences, plus a weighted sum of the utility functions of Alice, Bob, Cathy, and Dennis. (Note that the weights can, hypothetically, be negative; for example, screw Bob.)

The first prob­lem is that util­ity func­tions are not com­pa­rable, so we have to say more be­fore we can un­der­stand what “weighted sum” is sup­posed to mean. But sup­pose we’ve cho­sen some util­ity nor­mal­iza­tion tech­nique. There are still other prob­lems.

No­tice that we can’t to­tally define Sandy’s util­ity func­tion un­til we’ve defined Alice’s, Bob’s, Cathy’s, and Den­nis’. But any of those four might have cofrences which in­volve Sandy, as well!
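One way to see the circularity is to treat the definitions as a system of equations and look for a fixed point. The sketch below is only an illustration under assumptions: utilities are already normalized, only a single outcome is evaluated, and the weights are small enough for the iteration to converge. The function `resolve_cofrences` and the numbers are hypothetical.

```python
def resolve_cofrences(personal, cofrences, iterations=200):
    """Iterate U_i = personal_i + sum_j w_ij * U_j toward a fixed point.

    personal: {agent: number}, each agent's personal utility for ONE outcome.
    cofrences: {agent: {other_agent: weight}}.
    """
    utility = dict(personal)
    for _ in range(iterations):
        utility = {
            i: personal[i]
            + sum(w * utility[j] for j, w in cofrences.get(i, {}).items())
            for i in personal
        }
    return utility

# Sandy and Alice each care half as much about the other as about themselves.
personal = {"Sandy": 1.0, "Alice": 2.0}
cofrences = {"Sandy": {"Alice": 0.5}, "Alice": {"Sandy": 0.5}}
resolved = resolve_cofrences(personal, cofrences)
```

With these weights the loop settles at Sandy = 8/3 and Alice = 10/3. If both agents weight each other at 1 or more, the same loop grows without bound, which is one concrete form the problem can take.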


Aver­age util­i­tar­i­anism vs to­tal util­i­tar­i­anism.

Now that we have given some op­tions for util­ity com­par­i­son, can we use them to make sense of the dis­tinc­tion be­tween av­er­age util­i­tar­i­anism and to­tal util­i­tar­i­anism?

No. Utility com­par­i­son doesn’t re­ally help us there.

The av­er­age vs to­tal de­bate is a de­bate about pop­u­la­tion ethics. Harsanyi’s util­i­tar­i­anism the­o­rem and re­lated ap­proaches let us think about al­tru­is­tic poli­cies for a fixed set of agents. They don’t tell us how to think about a set which changes over time, as new agents come into ex­is­tence.

Allow­ing the set to vary over time like this feels similar to al­low­ing a sin­gle agent to change its util­ity func­tion. There is no rule against this. An agent can pre­fer to have differ­ent prefer­ences than it does. A col­lec­tive of agents can pre­fer to ex­tend its al­tru­ism to new agents who come into ex­is­tence.

How­ever, I see no rea­son why pop­u­la­tion ethics needs to be sim­ple. We can have rel­a­tively com­plex prefer­ences here. So, I don’t find para­doxes such as the Repug­nant Con­clu­sion to be es­pe­cially con­cern­ing. To me there’s just this com­pli­cated ques­tion about what ev­ery­one col­lec­tively wants for the fu­ture.

One of the ba­sic ques­tions about util­i­tar­i­anism shouldn’t be “av­er­age vs to­tal?”. To me, this is a type er­ror. It seems to me, more ba­sic ques­tions for a (prefer­ence) util­i­tar­ian are: