How to Measure Capacity for Welfare and Moral Status

Executive Summary

An an­i­mal’s ca­pac­ity for welfare is how good or bad its life can go. An an­i­mal’s moral sta­tus is the de­gree to which an an­i­mal’s ex­pe­riences or in­ter­ests mat­ter morally. It’s plau­si­ble that an­i­mals differ in their ca­pac­ity for welfare and/​or their moral sta­tus. Th­ese differ­ences could af­fect the way we ought to al­lo­cate re­sources across in­ter­ven­tions and/​or cause ar­eas. Un­for­tu­nately, mea­sur­ing ca­pac­ity for welfare and moral sta­tus is tremen­dously difficult.

When donors or re­searchers choose to fo­cus on cause ar­eas or in­ter­ven­tions that tar­get cer­tain species rather than oth­ers, they are of­ten im­plic­itly mak­ing judg­ments about the com­par­a­tive value of differ­ent an­i­mals (in­clud­ing hu­mans). Without a model for quan­tify­ing differ­ences in com­par­a­tive value, such judg­ments are apt to be guided by im­perfect and likely un­re­li­able heuris­tics.

There are two non-ex­clu­sive meth­ods we might em­ploy to mea­sure ca­pac­ity for welfare and moral sta­tus. The first method is to sur­vey var­i­ous ex­perts about what sort of trade­offs among an­i­mals they would en­dorse. This ap­proach is rel­a­tively sim­ple and cheap, but it re­lies on the as­sump­tion that in­tu­itions about moral trade­offs re­li­ably track the moral truth. This as­sump­tion looks du­bi­ous. In­tu­itive judg­ments of this kind are of­ten sen­si­tive to non-ev­i­den­tial fac­tors. Deep-rooted, wide­spread speciesism is likely to prej­u­dice re­sponses.

The sec­ond method is more time-con­sum­ing and com­plex but po­ten­tially more ob­jec­tive. The method pro­ceeds in three steps. The first step is to can­vass the rele­vant philo­soph­i­cal liter­a­ture to gen­er­ate a rel­a­tively the­ory-neu­tral list of char­ac­ter­is­tics that might con­tribute to ca­pac­ity for welfare or moral sta­tus. The sec­ond step is to find em­piri­cally mea­surable prox­ies for those char­ac­ter­is­tics and weight the prox­ies by their rel­a­tive im­por­tance. The third step is to can­vass the rele­vant sci­en­tific liter­a­ture to score differ­ent an­i­mals of in­ter­est ac­cord­ing to the fea­tures iden­ti­fied in the sec­ond step. Es­ti­mates of un­cer­tainty would be made at each step, and a sen­si­tivity anal­y­sis would help iden­tify ar­eas of high in­for­ma­tion value. I es­ti­mate that such a pro­ject would re­quire be­tween five thou­sand and seven thou­sand per­son-hours to com­plete.

Introduction and Context

This post is the sec­ond in Re­think Pri­ori­ties’ se­ries about com­par­ing ca­pac­ity for welfare and moral sta­tus across species. The pri­mary goal of this se­ries is to im­prove the way re­sources are al­lo­cated within the effec­tive an­i­mal ad­vo­cacy move­ment. A sec­ondary goal is to im­prove the al­lo­ca­tion of re­sources be­tween hu­man-fo­cused cause ar­eas and non­hu­man-an­i­mal-fo­cused cause ar­eas. In the first post I lay the con­cep­tual frame­work for the rest of the se­ries, out­lin­ing differ­ent the­o­ries of welfare and moral sta­tus and the re­la­tion­ship be­tween the two. In this, the sec­ond en­try in the se­ries, I pre­sent and eval­u­ate two method­olo­gies for mea­sur­ing and com­par­ing ca­pac­ity for welfare and moral sta­tus. In the third en­try in the se­ries, I ex­plain what the sub­jec­tive ex­pe­rience of time is, why it mat­ters, and why it’s plau­si­ble that there are morally sig­nifi­cant differ­ences in the sub­jec­tive ex­pe­rience of time across species. In the fourth en­try in the se­ries, I ex­plore crit­i­cal flicker-fu­sion fre­quency as a po­ten­tial proxy for the sub­jec­tive ex­pe­rience of time. In the fifth, sixth, and sev­enth en­tries in the se­ries, I in­ves­ti­gate vari­a­tion in the char­ac­ter­is­tic range of in­ten­sity of valenced ex­pe­rience across species.

The Measurement Problem

Hu­mans ex­ploit a huge va­ri­ety of an­i­mals. On an an­nual ba­sis, hu­mans slaugh­ter about 290 mil­lion frogs, 480 mil­lion goats, 2.9 billion snails, 3 billion ducks, 22 billion cochineal bugs, 69 billion chick­ens, 300 billion crus­taceans, and nearly a trillion com­mer­cially caught fish.[1] At any given time, hu­mans con­fine about 251 mil­lion sheep, 265 mil­lion cows, 7.5 billion hens, and more than 1.4 trillion bees to pro­duce wool, milk, eggs, and honey. Count­ing some­what con­ser­va­tively, hu­mans ex­ploit at least 33 or­ders of an­i­mals, across 13 classes and 6 phyla.[2] The effec­tive an­i­mal ad­vo­cacy (EAA) move­ment has limited re­sources, and it must choose how to al­lo­cate these scarce re­sources among these differ­ent an­i­mals, most of whom are treated mis­er­ably by hu­mans. Since we can’t (yet) help all these an­i­mals, we must de­cide which an­i­mals to pri­ori­tize.[3]

In the first en­try in this se­ries, I ar­gued that there are good rea­sons to think that an­i­mals differ in their ca­pac­ity for welfare and/​or their moral sta­tus. I also claimed that these differ­ences could sig­nifi­cantly af­fect the way we ought to al­lo­cate re­sources across cause ar­eas and in­ter­ven­tions. Of course, these differ­ences can only af­fect our al­loca­tive de­ci­sion-mak­ing if we know about them and know, at least roughly, their mag­ni­tudes. Hence, it is im­por­tant that we de­vise a method for re­li­ably mea­sur­ing ca­pac­ity for welfare and moral sta­tus and com­par­ing them across species.[4]

Meth­ods for mea­sur­ing ca­pac­ity for welfare and moral sta­tus can be as­sessed across a num­ber of im­por­tant crite­ria. The method must be valid and ac­cu­rate—that is, it must ac­tu­ally track ca­pac­ity for welfare and moral sta­tus and be sen­si­tive to differ­ences in ca­pac­ity for welfare and moral sta­tus. Ideally, the method would be ap­pli­ca­ble across species—that is, it would be ac­cu­rate and valid with re­spect to phy­lo­ge­net­i­cally dis­tant an­i­mals oc­cu­py­ing rad­i­cally differ­ent ecolog­i­cal niches.[5] Ideally, the method would be sen­si­tive to moral un­cer­tainty—that is, rather than as­sume a par­tic­u­lar nor­ma­tive frame­work, the method would al­low one to in­put a va­ri­ety of plau­si­ble ax­iolog­i­cal as­sump­tions and ob­serve how chang­ing the as­sump­tions changes the out­puts of the fi­nal model. The prac­ti­cal fea­si­bil­ity of the method must also be con­sid­ered. How sim­ple is the method to ex­e­cute and use when finished? How much would ex­e­cut­ing the method cost? How likely is it that an at­tempt to ex­e­cute the method ends in failure?[6]

In this post I con­sider two meth­ods for mea­sur­ing and com­par­ing ca­pac­ity for welfare and moral sta­tus: (1) a holis­tic ap­proach, in which rele­vant ex­perts em­ploy their nor­ma­tive and biolog­i­cal ex­per­tise to make all-things-con­sid­ered es­ti­mates of the ap­pro­pri­ate trade­offs be­tween differ­ent lives, ex­pe­riences, or in­ter­ests, and (2) an atom­istic ap­proach, in which we iden­tify em­piri­cal prox­ies for morally salient fea­tures, then let our best sci­en­tific un­der­stand­ing of the de­gree to which differ­ent an­i­mals pos­sess those fea­tures guide our es­ti­mates of com­par­a­tive moral value. The two ap­proaches are not in prin­ci­ple mu­tu­ally ex­clu­sive. One could in the­ory adopt both ap­proaches, then let one’s fi­nal es­ti­mates be con­di­tioned by a weighted re­flec­tive equil­ibrium be­tween the two.

I argue that the atomistic approach is the more difficult but ultimately the more accurate method. Thus, any reflective equilibrium between the two approaches ought to be weighted more heavily toward the atomistic rather than the holistic approach. Nonetheless, the atomistic approach faces serious complications along at least three dimensions: identifying empirically measurable proxies for the morally salient features, comparing those proxies across phylogenetically distant animals, and incorporating differential performance on those features into a unified, common metric weighted by the importance of the features.

Of course, we shouldn’t ex­pect that we’ll ever be able to pin­point an an­i­mal’s pre­cise ca­pac­ity for welfare or moral sta­tus. As I de­tail later in this se­ries, there is a tremen­dous amount of em­piri­cal un­cer­tainty about the ex­tent to which differ­ent an­i­mals dis­play differ­ent morally rele­vant traits and fea­tures. And even if the em­piri­cal un­cer­tainty could be re­solved, the philo­soph­i­cal un­cer­tainty would likely re­main.[7] Thus, our best method­ol­ogy ex­e­cuted as well as we can will still de­liver only ranges of val­ues, and it’s difficult to say in ad­vance how wide those ranges will be. At­tempt­ing to mea­sure ca­pac­ity for welfare and moral sta­tus will help us iden­tify our de­gree of un­cer­tainty re­gard­ing these is­sues. Merely know­ing the ex­tent of our un­cer­tainty could plau­si­bly im­prove our de­ci­sion-mak­ing pro­cess.[8]

The Holistic Approach

Ac­cord­ing to what I’m call­ing the holis­tic ap­proach to mea­sur­ing ca­pac­ity for welfare and moral sta­tus, the best way to es­ti­mate moral sta­tus and ca­pac­ity for welfare is to think holis­ti­cally about the com­par­a­tive value of differ­ent sorts of an­i­mals. The ap­proach is holis­tic be­cause it starts at the ques­tion we are try­ing to an­swer rather than try­ing to de­com­pose the ques­tion into con­stituent parts. The holis­tic ap­proach elic­its all-things-con­sid­ered judg­ments about the rel­a­tive value of differ­ent an­i­mals, and there is no un­der­ly­ing frame­work which de­ter­mines which con­sid­er­a­tions ought to bear on the fi­nal judg­ments.

Insofar as there is a currently endorsed method for adjudicating disputes about the comparative value of different animals, the holistic approach seems to be the preferred method. As far as I can tell, in most organizations decisions about the comparative value of different animals are governed by intuitive judgments rather than tables and spreadsheets.[9] This is probably for the best. Capacity for welfare and moral status are complicated topics; they don’t lend themselves to easy formulae. Explicit numerical representations are apt to gloss over important complexity, and simple quantitative models are unlikely to outperform all-things-considered judgments by domain experts. As we’ll see below, I estimate that executing the atomistic approach to the rough specifications I outline would require about six thousand person-hours, with little action-guiding payoff until at least midway through the project. Since most organizations don’t have that many hours to devote to these issues, the holistic approach is an acceptable short-term stopgap. In the medium to long term, the only way to ensure that we are efficiently allocating resources across different groups of animals is to invest the time and money necessary to thoroughly study moral status and capacity for welfare.

Tradeoffs and preferences

One way to at­tempt to mea­sure com­par­a­tive moral value is by di­rectly judg­ing what sort of trade­offs be­tween differ­ent species would be ap­pro­pri­ate. The trade­offs might be couched in terms of lives: we might won­der how many salmon we ought to be will­ing to let die in or­der to save one thou­sand turkeys. The trade­offs might be couched in terms of ex­pe­riences: we might won­der how many min­utes of suffer­ing we ought to be will­ing to let a lob­ster en­dure in or­der to alle­vi­ate one hun­dred min­utes of frog suffer­ing. Or the trade­offs might be couched in terms of in­ter­ests: as­sum­ing pigs and chick­ens have an equally strong in­ter­est in avoid­ing ex­treme con­fine­ment, we might won­der how many hens we ought to be will­ing to forgo free­ing in or­der to liber­ate ten sows.

Another ap­proach is to couch the trade­offs in terms of what species one would pre­fer to be. For ex­am­ple, Peter Singer al­lows that “it would not nec­es­sar­ily be speciesist to rank the value of differ­ent lives in some hi­er­ar­chi­cal or­der­ing. How we should go about do­ing this is an­other ques­tion, and I have noth­ing bet­ter to offer than the imag­i­na­tive re­con­struc­tion of what it would be like to be a differ­ent kind of be­ing” (Singer 2011: 91).[10] He sug­gests that “If it is true that we can make sense of the choice be­tween ex­is­tence as a horse and ex­is­tence as a hu­man, then – whichever way the choice would go – we can make sense of the idea that the life of one kind of an­i­mal pos­sesses greater value than the life of an­other; and if this is so, then the claim that the life of ev­ery be­ing has equal value is on very weak ground” (Singer 2011: 91). With a suit­ably large and di­verse sam­ple of matched pairs, we could cre­ate an or­dered rank­ing.
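One simple way to turn matched-pair choices into an ordered ranking is to sort animals by how often they are preferred. The sketch below is only an illustration under stated assumptions: the helper name `rank_by_wins` and the response data are hypothetical, and win-rate sorting is one of several possible aggregation rules rather than a method Singer proposes.

```python
from collections import Counter


def rank_by_wins(choices):
    """Rank animals by the share of matched pairs in which they were preferred.

    `choices` is a list of (preferred, other) tuples, one per response.
    """
    wins = Counter()
    appearances = Counter()
    for preferred, other in choices:
        wins[preferred] += 1
        appearances[preferred] += 1
        appearances[other] += 1
    # Use win rate rather than raw wins so that animals shown in more
    # pairs are not favored merely for appearing more often.
    rates = {animal: wins[animal] / appearances[animal] for animal in appearances}
    # Sort by descending win rate; break ties alphabetically.
    return sorted(rates, key=lambda animal: (-rates[animal], animal))


# Hypothetical responses, invented purely for illustration.
example_choices = [
    ("human", "horse"),
    ("human", "trout"),
    ("horse", "trout"),
    ("horse", "trout"),
    ("trout", "horse"),
]
ranking = rank_by_wins(example_choices)
print(ranking)
```

A ranking of this kind is purely ordinal: it says which animals tend to be preferred, not by how much.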

We could also ask how many days of one’s hu­man life one would be will­ing to forgo to ex­pe­rience some du­ra­tion of time as an­other species. This ap­proach would al­low us to as­sign car­di­nal num­bers to the value of an­i­mal lives. Shelly Ka­gan imag­ines such an ap­proach. He writes, “The av­er­age hu­man life span is about 79 years, or more than 28,000 days. Di­vided by ten thou­sand that’s still more than 2.8 days. If, like me, you wouldn’t give up even a sin­gle day as a per­son for an en­tire ex­tra life­time as a fly, then you agree that the welfare to be had within a fly’s life is less than one ten thou­sandth the welfare to be found in a per­son’s life” (Ka­gan 2019: 90, fn 5).[11] Again, by con­sid­er­ing one’s preferred trade­offs across a large and di­verse sam­ple of differ­ent an­i­mals, we could be­gin to con­struct a hi­er­ar­chy of com­par­a­tive moral value.
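Kagan’s arithmetic can be checked with a short calculation. The sketch below assumes a 365-day year and, as a further simplifying assumption that is mine rather than Kagan’s, that lifetime welfare accrues evenly across the days of a human life. The function and variable names are illustrative.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored

human_lifespan_days = 79 * DAYS_PER_YEAR
one_ten_thousandth_in_days = human_lifespan_days / 10_000


def implied_welfare_ceiling(days_refused, lifespan_days=human_lifespan_days):
    """Upper bound on (animal lifetime welfare / human lifetime welfare).

    If someone would not trade `days_refused` days of human life for an
    entire extra animal lifetime, then (assuming welfare accrues evenly per
    day) the animal lifetime is worth less than this fraction of a human life.
    """
    return days_refused / lifespan_days


print(human_lifespan_days)
print(round(one_ten_thousandth_in_days, 2))
# Refusing to give up even one day implies a ceiling below one ten-thousandth.
print(implied_welfare_ceiling(1) < 1 / 10_000)
```

Repeating the calculation for other animals, with the respondent’s own refusal threshold as input, would yield one cardinal ceiling per animal.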

Survey data

There is already a wealth of survey data about attitudes to animals, and some of this data can be repurposed to infer the general public’s positions on the morally appropriate tradeoffs among species. The Animal Attitudes Scale (AAS) has been in use since 1991.[12] The AAS and its variants[13] ask respondents to agree or disagree (on a five-point scale) with twenty-eight statements such as “It is morally wrong to hunt wild animals just for sport” and “Breeding animals for their skins is a legitimate use of animals.” Some of the questions refer to specific animals, such as “It is morally wrong to eat chicken and fish” and “A human has no right to use a horse as a means of transportation (riding) or entertainment (racing).” Comparing such responses reveals rough differences in attitudes toward different animals, but it does not reveal the degree to which some animals are valued more than others.

A new scale, the Animal Purpose Questionnaire (APQ), has recently been devised to offer “a more differentiated measure of attitudes to animal use across a variety of settings” (Bradley et al. 2020: 1). The APQ asks respondents the extent to which they agree (on a five-point scale) that it’s permissible for animals to be killed for different purposes. In total the APQ asks about sixteen animals[14] and five uses,[15] though the survey is designed so that respondents aren’t asked about all animals and all uses. Generalizing across respondents and usage categories, Bradley et al. 2020 find that respondents tend to value monkeys, badgers, tree shrews, chimpanzees, dogs, dolphins, and parrots more highly than rats, mice, pigs, octopuses, chickens, zebrafish, carp, and pigeons. Again, though, the scale cannot pinpoint the exact extent to which some animals are valued more than others.

Another re­cent sur­vey, Miralles et al. 2019, asked re­spon­dents about their rel­a­tive lev­els of em­pa­thy and com­pas­sion for an­i­mals of differ­ent species. Each re­spon­dent was asked to view pic­tures of two an­i­mals of differ­ent species. For the em­pa­thy ques­tions, re­spon­dents chose the an­i­mal for which they felt like they were “bet­ter able to un­der­stand the feel­ings or emo­tions of.” For the com­pas­sion ques­tions, re­spon­dents chose which an­i­mal they would save if both were in dan­ger of death. Both em­pa­thy and com­pas­sion de­creased with in­creas­ing phy­lo­ge­netic dis­tance from hu­mans. How­ever, once again, this sur­vey method­ol­ogy does not al­low us to in­fer the ex­act nu­mer­i­cal trade­offs be­tween an­i­mals that re­spon­dents would en­dorse.

The gen­eral pub­lic has oc­ca­sion­ally been sur­veyed about spe­cific trade­offs. For in­stance, in March 2019 Scott Alexan­der ran a small sur­vey (n=50) ask­ing re­spon­dents to es­ti­mate the rel­a­tive value of non­hu­man an­i­mal lives in com­par­i­son to a hu­man life. The me­dian re­spon­dent in his sur­vey es­ti­mated that a sin­gle hu­man is as valuable as 4,000 lob­sters, 500 chick­ens, 50 cows, 35 pigs, 7 elephants, or 5 chim­panzees. Shortly there­after, a com­menter called Tib­bar at­tempted to repli­cate Alexan­der’s sur­vey with a larger pool of re­spon­dents (n=263). The re­sults were strik­ingly differ­ent, with Tib­bar’s re­spon­dents rank­ing the rel­a­tive value of a hu­man life much lower than Alexan­der’s re­spon­dents. Ac­cord­ing to the me­dian re­spon­dent in Tib­bar’s sur­vey, a hu­man life is as valuable as 60 lob­sters, 25 chick­ens, 5 pigs, 3 cows, and 2 chim­panzees. (Elephants scored as highly as hu­mans.)

Rethink Priorities offered to investigate the discrepancy between Alexander’s and Tibbar’s results. We launched a new, larger survey (n=490) and found enormous variance in the value assigned to different animals. Many respondents assigned each animal a value equal to humans, and many respondents did essentially the opposite, indicating that human life was incommensurable with or infinitely more valuable than nonhuman animal life. In between these positions there was an extreme range, with some respondents assigning each animal a value nearly equal to that of humans and other respondents assigning nonhuman animals a moral value quadrillions of times lower than that of a single human. Such variegated data posed many interpretative challenges, but ultimately we concluded there were two natural ways to analyze the data, one of which supported Alexander’s high values and one of which supported Tibbar’s lower figures. Alexander reported on our findings in May 2019. A full write-up from Rethink Priorities is forthcoming.

Such sur­veys are not limited to pop­u­lar blogs. For ex­am­ple, in a 2007 phone sur­vey of one thou­sand Amer­i­cans, Ok­la­homa State agri­cul­tural economists Bailey Nor­wood and Jayson Lusk asked re­spon­dents to agree or dis­agree with the fol­low­ing state­ment: “If a new tech­nol­ogy were cre­ated that could ei­ther elimi­nate the suffer­ing of 1 hu­man or the suffer­ing of X farm an­i­mals, it should be used to elimi­nate the suffer­ing of the 1 hu­man.” The vari­able X was ran­domly set to 1, 10, 50, 100, 500, 1,000, 5,000, or 10,000. Ex­trap­o­lat­ing from the re­sults, Nor­wood and Lusk con­cluded that the av­er­age Amer­i­can be­lieves that the suffer­ing of one hu­man is equiv­a­lent to the suffer­ing of about 11,500 farm an­i­mals.[16]

A more re­cent sur­vey of this type is re­ported in Weathers et al. 2020. Re­spon­dents were asked to com­pare the suffer­ing of cows, pigs, and chick­ens via a se­ries of trade­off ques­tions. For ex­am­ple, a re­spon­dent might be asked to com­pare two hy­po­thet­i­cal pro­grams, the first of which would pre­vent one thou­sand cows from con­tract­ing an ill­ness that causes rapid death and the sec­ond of which would pre­vent X chick­ens from con­tract­ing a similar ill­ness that causes rapid death. Re­spon­dents were then re­quired to se­lect the low­est value of X for which the sec­ond pro­gram would pro­duce the greater over­all re­duc­tion in suffer­ing, with pos­si­ble val­ues for X rang­ing from one to one mil­lion.[17] Ac­cord­ing to the au­thors, “Ap­prox­i­mately 39.9% of par­ti­ci­pants val­ued cat­tle more than chick­ens, and a similar pro­por­tion (38.8%) val­ued pigs more than chick­ens” (Weathers et al. 2020: 4).[18]

Few of these sur­vey de­signs are ideal and none is perfect. Nev­er­the­less, the data pre­sented here do yield at least one ten­ta­tive con­clu­sion: many peo­ple are com­fortable en­dors­ing a hi­er­ar­chy of moral value. It ap­pears to be a com­monly—if not uni­ver­sally—ac­cepted view that some an­i­mals are more valuable than oth­ers, even if there is dis­agree­ment as to the ex­tent of the differ­ences. What re­mains to be seen is whether or not this po­si­tion is jus­tified. In the fol­low­ing sec­tion, I pre­sent some ev­i­dence that lay in­tu­itions about com­par­a­tive moral value should not be trusted.

The problem with appeals to intuition

One ini­tial con­cern about ap­peals to in­tu­ition in this do­main is the gen­eral zo­olog­i­cal ig­no­rance of the in­tu­it­ing pub­lic. As I noted above, hu­mans di­rectly ex­ploit at least 33 or­ders of an­i­mals across 13 classes and 6 phyla. The av­er­age per­son sim­ply doesn’t know much de­tail about the lives of, say, goats, geese, carp, cat­fish, earth­worms, silk­worms, snails, or squid.[19] But with­out de­tailed knowl­edge of the char­ac­ter­is­tics of the species un­der com­par­i­son, it’s hard to see what could jus­tify judg­ments of com­par­a­tive moral worth. How­ever, I want to set this worry aside. I as­sume that if we adopted the holis­tic ap­proach, we would only care to elicit the in­tu­itions of qual­ified ex­perts.[20] My worry is that even the in­tu­itions of zo­olog­i­cal ex­perts will be un­re­li­able.

The holistic approach is driven by all-things-considered judgments about which species one ought to prefer to be and which tradeoffs between species are morally appropriate. By their very nature, these judgments have somewhat opaque origins. The judgments are not the product of a clearly delineated algorithm or decision tree. They don’t reveal whether variation in moral value is due to differences in moral status or differences in capacity for welfare (or both). They certainly don’t say which particular features are driving the difference in moral value. In many ways this is a feature, not a bug: developing a formula for calculating moral value is difficult, and without rigorous, protracted investigation, such a formula is unlikely to outperform rapid intuitive judgment. But the speed of intuitive judgment comes at a price: when the origin of a judgment is opaque, it’s easier for unwanted influences to creep in without one’s knowledge.[21]

There is already a large liter­a­ture which demon­strates that un­cal­ibrated in­tu­itions are of­ten sen­si­tive to non-ev­i­den­tial fac­tors.[22] Since in­tu­itions about the com­par­a­tive moral value of non­hu­man an­i­mals are not amenable to in­de­pen­dent cal­ibra­tion, these in­tu­itions are al­most cer­tainly in­fluenced to some de­gree by fac­tors that are morally ir­rele­vant. So I think there is good rea­son in gen­eral to worry that un­wanted con­sid­er­a­tions un­duly sway one’s in­tu­itions about the value of non­hu­man an­i­mals. To com­pound this gen­eral worry, there are rea­sons to think that, in the spe­cific case at hand, ir­rele­vant fac­tors are likely to un­con­sciously taint our rea­son­ing.

There is am­ple ev­i­dence that hu­mans tend to value large mam­mals, es­pe­cially those with big eyes or hu­man-like char­ac­ter­is­tics, over other an­i­mals. An­i­mals with fur are preferred to those with scales; an­i­mals with two or four limbs are preferred to those with six, eight, or ten limbs. An­i­mals deemed to be ag­gres­sive, dirty, or dan­ger­ous are per­ceived nega­tively. Com­pan­ion an­i­mals and pets at­tract more sym­pa­thy than com­pa­rable farmed or wild an­i­mals.[23] Th­ese fac­tors, and many oth­ers, will plau­si­bly in­fluence our re­ac­tions to thought ex­per­i­ments.

Con­sider, for in­stance, Singer’s in­vi­ta­tion to as­sess whether we would pre­fer to ex­pe­rience life as one species rather than an­other. The goal of the ex­er­cise is to use our imag­i­na­tive fac­ul­ties to es­ti­mate the ca­pac­ity for welfare that differ­ent an­i­mals pos­sess.[24] In ad­di­tion to the above bi­ases, such thought ex­per­i­ments may be swayed by per­sonal con­sid­er­a­tions that have noth­ing to do with ca­pac­ity for welfare. Per­haps I would pre­fer to be a mar­lin rather than a chim­panzee be­cause I like to swim. Per­haps I would rather be a gecko than a po­lar bear be­cause I dis­like cold climes. Per­haps I’d pre­fer to be a pen­guin rather than a snake be­cause I know pen­guins en­gen­der more sym­pa­thy than snakes.

Similar con­cerns haunt Ka­gan’s in­vi­ta­tion to con­sider how much of one’s hu­man life one would sac­ri­fice to gain an ex­tra life­time as a mem­ber of an­other species. Per­haps I would gladly sac­ri­fice a year of my hu­man life for an ex­tra life­time as a spar­row be­cause the nov­elty of un­aided flight in­trigues me. Of course, in care­fully pre­sent­ing the thought ex­per­i­ment we would stipu­late that such per­sonal prefer­ences are ir­rele­vant and ought to be brack­eted when con­sid­er­ing the trade­off. But it’s un­clear ex­actly which per­sonal prefer­ences are ir­rele­vant, and even if we were con­fi­dent in our delineation of rele­vant ver­sus ir­rele­vant prefer­ences, it’s an open ques­tion how suc­cess­fully we can bracket the ir­rele­vant in­fluences by stipu­la­tion.[25] More­over, it’s also a bit un­clear how Ka­gan’s thought ex­per­i­ments could tell us much about moral sta­tus (as op­posed to ca­pac­ity for welfare). Mo­ral sta­tus pre­sum­ably makes no in­trin­sic con­tri­bu­tion to phe­nomenol­ogy, so if two an­i­mals had the same ca­pac­ity for welfare but differ­ent moral sta­tuses, it’s un­clear why I should be will­ing to give up more time for an ex­tra life as the an­i­mal with the higher sta­tus.

The sur­veys in the pre­vi­ous sec­tion some­times re­port re­sults that are best un­der­stood if we ac­cept that ir­rele­vant fac­tors of­ten drive our in­tu­itive re­sponses. For in­stance, in the APQ, re­spon­dents typ­i­cally don’t ob­ject to kil­ling mice and rats for med­i­cal re­search, ba­sic sci­ence re­search, or pest con­trol, but re­spon­dents do ob­ject to them be­ing kil­led for food pro­duc­tion. This is al­most cer­tainly be­cause re­spon­dents find the idea of eat­ing mice and rats un­ap­peal­ing, which, of course, is to­tally beside the point (Bradley et al. 2020: 17-18).

Then there is the specter of speciesism, which in sim­ple terms is a prej­u­dice in fa­vor of one’s own species. Jeff Sebo warns that “when we are con­sid­er­ing a topic like an­i­mal ethics, when our in­tu­itions are so heav­ily in­fluenced by speciesism and other such bi­ases, there is a risk that fo­cus­ing nar­rowly on sim­ple, ideal­ized thought ex­per­i­ments, as Ka­gan does, will an­chor us to an un­ac­cept­ably con­ser­va­tive and speciesist moral the­ory” (Sebo 2020: 6). Sebo adds that “we in­tu­itively un­der­es­ti­mate the ca­pac­i­ties of non­hu­man an­i­mals for a va­ri­ety of rea­sons. We tend to per­ceive hap­piness and au­ton­omy more eas­ily in hu­mans than in other an­i­mals. More­over, in­so­far as there are limits on how happy or au­tonomous an an­i­mal can be, we tend to at­tribute these limits to in­ter­nal causes, that is, facts about the an­i­mal, rather than to ex­ter­nal causes, that is, facts about the con­di­tions in which the an­i­mal is liv­ing. If we were to cor­rect for these ten­den­cies, then we would likely see less of a di­vide be­tween hu­mans and other an­i­mals than we cur­rently do” (Sebo 2020: 6). Sebo fo­cuses his crit­i­cism on com­par­i­sons be­tween hu­mans and non­hu­mans. But what about com­par­i­sons re­stricted only to non­hu­man an­i­mals?

The be­lief that hu­mans are more valuable than non­hu­mans is of­ten a man­i­fes­ta­tion of speciesism. But the idea that some non­hu­man an­i­mals are worth more than some other non­hu­man an­i­mals isn’t ob­vi­ously speciesist. If I judge that chick­ens are more valuable than trout, it’s not ob­vi­ous how such a judg­ment re­flects a prej­u­dice in fa­vor of my own species. Nonethe­less, the judg­ment might still be speciesist if hu­mans are un­jus­tifi­ably taken to be the stan­dard against which non­hu­man an­i­mals are com­pared, even when they are com­pared against each other. Similar­ity to hu­mans might be the met­ric by which most peo­ple eval­u­ate the com­par­a­tive moral value of non­hu­man an­i­mals, and if that is a speciesist crite­rion, then the com­par­i­sons will be tainted by speciesism.

The Atomistic Approach

All told, I think the above wor­ries sug­gest that we should search for a more ob­jec­tive ap­proach to mea­sur­ing com­par­a­tive moral value. Of course, no ap­proach to mea­sur­ing com­par­a­tive moral value will be com­pletely de­void of ap­peals to in­tu­ition and im­mune to the in­fluence of speciesism. How­ever, by stan­dard­iz­ing the in­puts to the model and ty­ing those in­puts to em­piri­cal data, we can make our in­tu­itions ex­plicit and pub­lic, the bet­ter to judge them. Such a model would hope­fully re­duce the ex­tent to which we are swayed by non-ev­i­den­tial fac­tors. It would also en­able us to pin­point the differ­ences driv­ing dis­agree­ments about judg­ments of com­par­a­tive moral value, mak­ing such dis­agree­ments more pro­duc­tive.

The holis­tic ap­proach to mea­sur­ing ca­pac­ity for welfare and moral sta­tus re­lies on all-things-con­sid­ered sub­jec­tive judg­ments. If such judg­ments are likely prone to er­rors that bias the pro­cess, we should be wary of the holis­tic ap­proach. One way to help cor­rect for these bi­ases is to ground our judg­ments wher­ever pos­si­ble in the hard facts of an­i­mal phys­iol­ogy, psy­chol­ogy, and ethol­ogy.[26] De­spite Ka­gan’s in­vi­ta­tion to en­ter­tain var­i­ous thought ex­per­i­ments, he con­cedes that “any par­tic­u­lar judg­ments we might make about how one type of an­i­mal ranks in com­par­i­son to oth­ers will be sub­ject to re­vi­sion in light of fur­ther ad­vances in em­piri­cal sci­ence. We may well dis­cover that we have over­es­ti­mated or un­der­es­ti­mated the psy­cholog­i­cal ca­pac­i­ties of any given type of an­i­mal. Un­sur­pris­ingly, then, any such rank­ing will re­main ten­ta­tive (and per­haps a bit rough as well). But in prin­ci­ple, at least, a suit­ably in­formed rank­ing could be pro­duced, and that rank­ing could then be im­proved upon as sci­ence re­veals more about the de­tails of an­i­mal psy­chol­ogy” (Ka­gan 2019: 113-114, em­pha­sis added).

The holis­tic ap­proach be­gins with ques­tions about the morally ap­pro­pri­ate trade­offs among species. An al­ter­na­tive is to first de­velop a rough sys­tem for ad­ju­di­cat­ing com­par­i­sons of ca­pac­ity for welfare and moral sta­tus. What I’m call­ing the atom­istic ap­proach be­gins with the ques­tion ‘What fea­tures and char­ac­ter­is­tics de­ter­mine moral sta­tus and ca­pac­ity for welfare?’ then uses the an­swers to that ques­tion to de­ter­mine the morally ap­pro­pri­ate trade­offs among species. The ap­proach is atom­istic be­cause it de­com­poses the ques­tion of com­par­a­tive moral value into dis­crete con­stituents (atoms) that are an­swered in­de­pen­dently and then ag­gre­gated. This ap­proach is nec­es­sar­ily more com­pli­cated, but the po­ten­tial gains in ac­cu­racy may be worth it. In the fol­low­ing sec­tion, I out­line what one such strat­egy for pro­duc­ing a suit­ably in­formed rank­ing might look like.

A rough guide to estimating moral status and capacity for welfare atomistically

Such an ap­proach might pro­ceed in three stages. The first stage would lay the con­cep­tual frame­work for the pro­ject.[27] Dur­ing this stage, one would spec­ify which fea­tures are likely to de­ter­mine ca­pac­ity for welfare and moral sta­tus.[28] This stage would not re­quire one to take a defini­tive stance on differ­ent the­o­ries of welfare and moral sta­tus, but it would re­quire one to de­ter­mine the im­pli­ca­tions of var­i­ous plau­si­ble views. Be­cause philo­soph­i­cal ques­tions are no­to­ri­ously difficult to re­solve, we should be fairly un­cer­tain about which the­o­ries of welfare and moral sta­tus are cor­rect. Given this deep un­cer­tainty, we should prob­a­bly value in­ter­ven­tions that are ro­bust across a num­ber of differ­ent plau­si­ble views. The goal of this stage would be twofold: (1) to gen­er­ate a rel­a­tively the­ory-neu­tral list of char­ac­ter­is­tics that might con­tribute to ca­pac­ity for welfare or moral sta­tus and (2) to un­der­stand the rel­a­tive im­por­tance of the char­ac­ter­is­tics, weighted both by the im­por­tance of the char­ac­ter­is­tic within a given the­ory and by the prob­a­bil­ity that the the­ory is true.

The sec­ond stage would lay the method­olog­i­cal frame­work for the pro­ject. Dur­ing this stage, one would op­er­a­tional­ize the fea­tures and char­ac­ter­is­tics enu­mer­ated dur­ing the first stage into mea­surable prox­ies. This stage would re­quire en­gage­ment with the em­piri­cal liter­a­ture so as to know what in prac­tice can be mea­sured. But this stage would also re­quire sub­stan­tive the­o­ret­i­cal rea­son­ing, to judge which met­rics are good prox­ies for the fea­tures we ul­ti­mately care about. A key goal of this stage would be to find mea­surable met­rics that can be mean­ingfully com­pared across phy­lo­ge­net­i­cally dis­tant an­i­mals. The met­rics must also be com­pa­rable in some sense to each other, so that they can be weighted against each other.

The third stage would be the sim­plest but the most time-in­ten­sive. First, we would se­lect the an­i­mals to be in­ves­ti­gated for the pro­ject (in­clud­ing the tax­o­nomic rank at which to com­pare them). Be­cause the goal of the pro­ject is to im­prove the way re­sources are al­lo­cated across in­ter­ven­tions, it makes sense to se­lect an­i­mals that hu­mans di­rectly ex­ploit in very large num­bers. Next, the rele­vant sci­en­tific liter­a­ture would be sys­tem­at­i­cally re­viewed and or­ga­nized, and the re­sults com­piled in a large database. The end-product might be a table con­sist­ing of ~30 fea­tures mea­sured across ~30 or­ders of an­i­mals. This tem­plate pro­vides an ex­am­ple frame­work.[29] (Note that the taxa and fea­tures are purely illus­tra­tive; they don’t rep­re­sent fi­nal judg­ments about what it would be worth­while to in­ves­ti­gate.) The database could ei­ther be used in­for­mally to guide and jus­tify all-things-con­sid­ered judg­ments, or it could be for­mal­ized into an al­gorithm that takes the table in­puts and con­verts them into nu­mer­i­cal es­ti­mates of com­par­a­tive moral value across differ­ent an­i­mals. For the lat­ter use, we ought also to con­duct a sen­si­tivity anal­y­sis and es­ti­mate our un­cer­tainty for all the in­put pa­ram­e­ters, so that we can iden­tify where the value of new in­for­ma­tion is high­est.
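As a rough illustration of the formalized option, the sketch below collapses one row of such a table into a weighted score and runs a simple one-at-a-time sensitivity check. It assumes each feature has already been scored on a common 0-to-1 scale with a low, best-guess, and high value. The feature names, scores, weights, and the helper names `score` and `swings` are all invented for the example and reflect no actual findings.

```python
# Hypothetical inputs for a single taxon: (low, best guess, high) on a 0-1 scale.
inputs = {
    "feature_x": (0.7, 0.8, 0.9),
    "feature_y": (0.1, 0.5, 0.9),
}
# Placeholder feature weights (assumed, not derived from any theory).
weights = {"feature_x": 0.7, "feature_y": 0.3}


def score(values, weights):
    """Weighted sum of feature scores."""
    return sum(weights[name] * values[name] for name in weights)


def swings(inputs, weights):
    """Vary each input across its range while holding the others at best guess."""
    best = {name: triple[1] for name, triple in inputs.items()}
    result = {}
    for name, (low, _best, high) in inputs.items():
        at_low = score({**best, name: low}, weights)
        at_high = score({**best, name: high}, weights)
        result[name] = at_high - at_low
    return result


best_guess = {name: triple[1] for name, triple in inputs.items()}
best_score = score(best_guess, weights)
swing = swings(inputs, weights)
widest = max(swing, key=swing.get)  # input whose uncertainty moves the score most
print(best_score, swing, widest)
```

With these made-up numbers, `feature_y` produces the larger swing even though it carries the smaller weight, because its range is much wider. The swing is only a crude stand-in for the value of new information, but it shows how uncertainty estimates and weights jointly determine where further research would matter most.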

Such a pro­ject would be com­pa­rable in scope and struc­ture to Re­think Pri­ori­ties’ 2019 work on in­ver­te­brate sen­tience. Based on that anal­ogy, I es­ti­mate that mea­sur­ing ca­pac­ity for welfare and moral sta­tus in this way would re­quire some­where be­tween five thou­sand and seven thou­sand per­son-hours. In the rest of this sec­tion, I dis­cuss some the­o­ret­i­cal and prac­ti­cal ob­sta­cles that would need to be over­come in or­der to ad­e­quately mea­sure ca­pac­ity for welfare and moral sta­tus via the atom­istic ap­proach I have out­lined. The list is cer­tainly non-ex­haus­tive, but it should give a rep­re­sen­ta­tive fla­vor of the difficul­ties the atom­istic ap­proach faces. The ob­sta­cles are pre­sented in in­creas­ing or­der of se­ri­ous­ness.

Choos­ing tax­o­nomic rank

It is difficult to choose the right level of gen­er­al­ity at which to try to mea­sure ca­pac­ity for welfare and moral sta­tus. There are com­pet­ing con­sid­er­a­tions at play in this de­ci­sion. On the one hand, there is pres­sure to drill down to a fairly nar­row taxon (genus or species, say). The higher up the tax­o­nomic hi­er­ar­chy one goes, the more phy­lo­ge­net­i­cally di­verse a taxon be­comes. If a taxon be­comes too di­verse, then the fact that a par­tic­u­lar an­i­mal within the taxon pos­sesses some rele­vant fea­ture doesn’t guaran­tee that other an­i­mals within the taxon also pos­sess the fea­ture.

Recall from the first post in the series that, strictly speaking, capacity for welfare and moral status are properties of individuals. Obviously, it is not possible to investigate every individual animal that might be subject to some intervention in order to determine the individual animal’s capacity for welfare or moral status. As Kagan puts it: “After all, it would hardly be feasible to expect us to undertake a detailed investigation of a given animal’s specific psychological capacities each time we were going to interact with one. This makes it almost inevitable that in normal circumstances we will assign a given animal on the basis of its species (or, more likely still, on the basis of even larger, more general biological categories)” (Kagan 2019: 294). Moreover, we should expect typical variation in capacity for welfare and moral status among members of the same species to be minimal. Generalizing to the level of species thus appears unproblematic.

Although gen­er­al­iz­ing to the level of species poses lit­tle the­o­ret­i­cal difficulty, mea­sur­ing ca­pac­ity for welfare and moral sta­tus at that level is prob­a­bly prac­ti­cally in­fea­si­ble. There are hun­dreds of species that hu­mans ex­ploit in large num­bers. Not only would it be difficult to in­ves­ti­gate such a large num­ber of an­i­mals, but the lower one goes in the tax­o­nomic hi­er­ar­chy, the less re­search is available that per­tains to a given taxon. For all but the most com­monly stud­ied model or­ganisms, it would be im­pos­si­ble to fill in the database at the level of species.

Moving a couple of rungs up the taxonomic ladder to the rank of family improves the situation, but only slightly. There are approximately 50-60 families of animals that are directly exploited in large numbers. Moving up another rung in the ladder[30] to the rank of order reduces the number of taxa to be investigated to a more manageable 33.[31] (Some of those animals, such as jellyfish, bivalves, and nematodes, might be thought to lack moral standing and thus could possibly be safely ignored, further reducing the final number.) The difference between investigating moral status and capacity for welfare at the rank of order rather than family could be as high as a thousand person-hours.

Unfortunately, measuring capacity for welfare and moral status at the rank of order may gloss over important differences among animals. To give just one example: humans and lemurs are both in the order Primates. However, many people find it implausible that humans and lemurs have the same moral status or capacity for welfare.[32] Thus, order may not be a fine-grained enough taxonomic rank to capture the relevant moral status facts.

Ul­ti­mately, the choice of tax­o­nomic rank in the pro­ject must be guided by a bal­ance of con­sid­er­a­tions. Move too low, and the pro­ject bal­loons in size and the prob­a­bil­ity of find­ing rele­vant sci­en­tific stud­ies for each taxon plum­mets. Move too high, and im­por­tant, ac­tion-rele­vant in­for­ma­tion will be missed. Per­son­ally, I think or­der is prob­a­bly the right rank at which to in­ves­ti­gate the sub­ject. In part this be­lief is driven by the view that if moral sta­tus ad­mits of de­grees, then moral sta­tus is dis­crete and or­ga­nized into a rel­a­tively small num­ber of lev­els.[33] The smaller the num­ber of tiers, the more sense it makes to in­ves­ti­gate moral sta­tus at a higher tax­o­nomic rank. It’s less plau­si­ble that ca­pac­ity for welfare is dis­crete, so if one thinks ca­pac­ity for welfare is much more im­por­tant than moral sta­tus in de­ter­min­ing char­ac­ter­is­tic moral worth, then one prob­a­bly ought to fa­vor a more fine-grained in­ves­ti­ga­tion.

Find­ing mea­surable proxies

The atom­istic ap­proach recom­mends first can­vass­ing the philo­soph­i­cal liter­a­ture to as­cer­tain which gen­eral fea­tures de­ter­mine ca­pac­ity for welfare and moral sta­tus. Across plau­si­ble philo­soph­i­cal views, there ap­pears to be a rough con­sen­sus as to what sorts of gen­eral fea­tures are rele­vant for moral sta­tus and ca­pac­ity for welfare. Ex­am­ples in­clude: in­ten­sity of valenced ex­pe­riences, self-aware­ness, gen­eral in­tel­li­gence, au­ton­omy, long-term plan­ning, com­mu­nica­tive abil­ity, af­fec­tive com­plex­ity, self-gov­er­nance, ab­stract thought, cre­ativity, so­cia­bil­ity, and nor­ma­tive eval­u­a­tion.[34] How­ever, it’s one thing to iden­tify gen­eral fea­tures that gov­ern ca­pac­ity for welfare and moral sta­tus. It’s an­other mat­ter en­tirely to find em­piri­cally mea­surable prox­ies for those fea­tures. If these fea­tures can­not be op­er­a­tional­ized in a way that al­lows them to be mea­sured, then the atom­istic ap­proach can­not suc­ceed.

The grav­ity of this con­cern de­pends of course on which fea­tures one be­lieves de­ter­mine moral sta­tus and ca­pac­ity for welfare. Some fea­tures are more amenable to op­er­a­tional­iza­tion than oth­ers. If one be­lieved, for in­stance, that neu­ron count wholly de­ter­mines moral sta­tus and ca­pac­ity for welfare, then one would have a rel­a­tively straight­for­ward method for mea­sur­ing moral worth.[35] Sadly, the view that neu­rons de­ter­mine moral sta­tus or ca­pac­ity for welfare ap­pears rather un­promis­ing. Neu­rons alone do not au­to­mat­i­cally gen­er­ate con­scious ex­pe­rience, and neu­rons are not them­selves in­trin­si­cally morally valuable.[36] Larger an­i­mals need more neu­rons just to co­or­di­nate move­ment and au­to­nomic func­tions. Larger an­i­mals also re­quire more neu­rons to in­ner­vate their larger mus­cles,[37] and larger an­i­mals tend to pro­cess larger sen­sory fields to in­ter­act with their larger world, which re­quires a greater num­ber of neu­rons just to pro­cess the data at the same level of com­plex­ity as a smaller an­i­mal would. Neu­ron counts alone do not tell us how the neu­rons are or­ga­nized, how the neu­rons are used, or how many synap­tic con­nec­tions each neu­ron pos­sesses. If neu­ron counts are worth in­ves­ti­gat­ing and com­par­ing at all, it’s only be­cause they are them­selves rough prox­ies for char­ac­ter­is­tics we care about. Per­haps neu­ron count cor­re­lates roughly with af­fec­tive so­phis­ti­ca­tion or in­ten­sity of valenced ex­pe­rience or gen­eral in­tel­li­gence.

Un­for­tu­nately, many of these po­ten­tially morally im­por­tant char­ac­ter­is­tics seem ex­tremely difficult to op­er­a­tional­ize, de­spite re­peated at­tempts to do so. For in­stance, a fea­ture as amor­phous as gen­eral in­tel­li­gence is un­likely to be cap­tured by any sin­gle met­ric. The biol­o­gists Lesley Rogers and Gisela Ka­plan put the point this way: “In­tel­li­gence is not an en­tity that can be mea­sured by perfor­mance on just one task, nor can it be in­ferred from brain size, as we dis­cuss be­low. Here it is worth not­ing that pi­geons, tested on a task based on one prob­lem taken from a stan­dard IQ test for hu­mans, which re­quired them to rec­og­nize sym­bols ro­tated at differ­ent an­gles, sur­passed hu­mans in perfor­mance of the same task (Delius, 1987). Would we there­fore rank them above us in in­tel­li­gence? Ob­vi­ously, the sin­gle crite­rion of as­sess­ment is an in­ad­e­quate mea­sure for in­tel­li­gence in a broad sense. Although IQ tests have some de­gree of limited val­idity in terms of pre­dictabil­ity of aca­demic suc­cess in a given cul­ture and class in hu­mans (Stern­berg, Gri­gorenko, and Bundy, 2001), there is in fact no sci­en­tifi­cally ac­cept­able way of mea­sur­ing in­tel­li­gence as a broad set of char­ac­ter­is­tics in hu­mans, let alone in an­i­mals. Add to this the am­bi­tion of mak­ing com­par­i­sons of in­tel­li­gence across species and it is easy to see how flawed such at­tempts would have to be” (Rogers & Ka­plan 2004: 177-178).

Although I am not quite so pes­simistic as Rogers and Ka­plan, it’s cer­tainly worth ac­knowl­edg­ing this difficulty at the out­set of any at­tempt to mea­sure gen­eral in­tel­li­gence. Since many of the morally im­por­tant fea­tures will be more akin to gen­eral in­tel­li­gence than neu­ron count, we should ex­pect the atom­istic ap­proach to al­lo­cate many hun­dreds of per­son-hours to the task of iden­ti­fy­ing mea­surable prox­ies for morally salient fea­tures. This task will al­most cer­tainly re­quire the col­lab­o­ra­tion of ex­perts across mul­ti­ple do­mains.

Com­par­ing fea­tures across animals

Even if mea­surable prox­ies for the fea­tures we care about could be found, we would still have to com­pare those prox­ies across an­i­mals. Since hu­mans ex­ploit such a large and phy­lo­ge­net­i­cally di­verse range of an­i­mals, com­par­ing the fea­tures is not go­ing to be easy. Ex­per­i­ments ex­plor­ing ro­dent in­tel­li­gence likely look very differ­ent from ex­per­i­ments ex­plor­ing eel in­tel­li­gence. Ex­per­i­ments ex­plor­ing self-con­trol in chick­ens likely look very differ­ent from ex­per­i­ments ex­plor­ing self-con­trol in oc­to­puses. Ex­per­i­ments ex­plor­ing cow emo­tions likely look very differ­ent from ex­per­i­ments ex­plor­ing fruit fly emo­tions. Ex­per­i­ments ex­plor­ing the so­cia­bil­ity of sheep likely look very differ­ent from ex­per­i­ments ex­plor­ing so­cia­bil­ity in honey bees. And so on. To prop­erly com­pare re­sults across ex­per­i­ments on differ­ent types of an­i­mals, some sort of nor­mal­iza­tion across stud­ies is re­quired.

For­tu­nately, there already ex­ists a sci­en­tific dis­ci­pline that aims to com­pare difficult-to-mea­sure fea­tures across differ­ent an­i­mals. Com­par­a­tive cog­ni­tion is an in­ter­dis­ci­plinary field at the in­ter­sec­tion of an­i­mal psy­chol­ogy, neu­rol­ogy, ethol­ogy, and evolu­tion­ary biol­ogy. Any at­tempt to mea­sure ca­pac­ity for welfare or moral sta­tus across an­i­mals will al­most cer­tainly rely heav­ily on com­par­a­tive cog­ni­tion stud­ies. There are promi­nent com­par­a­tive cog­ni­tion labs across the globe. Ex­am­ples in­clude Cam­bridge’s Com­par­a­tive Cog­ni­tion Lab, Univer­sity of Ex­eter’s Cen­tre for Re­search in An­i­mal Be­havi­our, Lund Univer­sity’s Cog­ni­tive Zool­ogy Group, Tufts Univer­sity’s Com­par­a­tive Cog­ni­tion Lab, Univer­sity of Helsinki’s Com­par­a­tive Mind Group, Rochester In­sti­tute of Tech­nol­ogy’s Com­par­a­tive Cog­ni­tion & Per­cep­tion Lab, and IGDORE’s In­ter­dis­ci­plinary Re­search Group in An­i­mal Be­havi­oural Science.

In the last 15 years there has been a surge of interest in comparing species across a number of different interesting metrics. For example, MacLean et al. 2014 compare “the cognitive performance of 567 individuals representing 36 species on two problem-solving tasks measuring self-control” (E2140). In addition to direct comparisons, there are also a number of meta-analyses that compile data from multiple studies to arrive at comparative conclusions. For example, Cauchoix et al. 2018 “gathered 44 studies on individual performance of 25 species across six animal classes” in an effort to understand the evolution of cognition. Meanwhile, there has been a concomitant surge in theoretical discussions about how to compare features across species. For example, Weiss et al. 2019 outline a quantitative measure of social complexity that works across species, and Anderson & Adolphs 2014 develop a framework for studying emotions across species. These studies and others like them paint a promising picture of the potential of comparative cognition.

How­ever, com­par­a­tive cog­ni­tion is a dis­ci­pline still in its in­fancy. Be­cause there may be a gen­eral bias to­ward non-null ex­per­i­men­tal re­sults in the sci­ences, es­pe­cially for rel­a­tively small and im­ma­ture fields, we should be cau­tious about the con­clu­sions of any one study (Ioan­ni­dis 2005). There is rea­son to think that aca­demic jour­nals fa­vor pa­pers with sur­pris­ing re­sults over pa­pers which merely con­firm the ex­pected. Thus, there may be a pub­li­ca­tion bias in fa­vor of an­i­mals do­ing sur­pris­ing things. In the pre­sent case that might mean that com­par­a­tive cog­ni­tion stud­ies which pur­port to demon­strate so­phis­ti­cated cog­ni­tive abil­ities in non­hu­mans are over­rep­re­sented in the liter­a­ture or that claims of com­pa­ra­bil­ity are ex­ag­ger­ated to gloss over un­der­min­ing com­pli­ca­tions. Repli­ca­tion stud­ies are, in gen­eral, un­der-re­warded in academia, so cor­rect­ing for this over­rep­re­sen­ta­tion and ex­ag­ger­a­tion may take years or even decades.

A re­cent (as yet un­pub­lished) crit­i­cism of the field sug­gests com­par­a­tive cog­ni­tion re­search is bi­ased be­cause “(1) Phenomenon-based com­par­a­tive cog­ni­tion uses con­fir­ma­tory re­search meth­ods that are di­rec­tion­ally bi­ased, (2) In com­bi­na­tion with a pub­li­ca­tion bias and a likely high rate of false dis­cov­er­ies, this bias sug­gests our liter­a­ture con­tains many false pos­i­tive find­ings, (3) This di­rec­tional bias per­sists even with strong method­olog­i­cal crit­i­cism, and when re­searchers ex­plic­itly con­sider al­ter­na­tive ex­pla­na­tions for the phe­nom­ena stud­ied, (4) No for­mal method ex­ists for gen­er­at­ing and as­sess­ing the­ory-dis­con­firm­ing ev­i­dence that could counter the bi­ased pos­i­tive ev­i­dence, (5) Am­bi­guity in defi­ni­tions al­low us as re­searchers to flex­ibly ad­just our sub­stan­tive claims de­pend­ing on whether we are re­fut­ing crit­i­cism or sel­l­ing the re­sults, (6) The small size of com­par­a­tive cog­ni­tion as a re­search field per­pet­u­ates and re­in­forces points (1) - (5)” (Far­rar & Os­to­jic 2019: 4). To­gether, these points fa­vor a healthy skep­ti­cism when new re­search analo­gizes be­hav­ior across species or as­cribes sur­pris­ingly so­phis­ti­cated cog­ni­tive abil­ities to non­hu­man an­i­mals.[38]

Some critics of comparative cognition worry that some questions the field pursues are premised on false assumptions. Daniel McShea complains of the “already fraught exercise of making comparisons across species lines,” wondering “how are we to compare the capabilities of, say, dusky titi monkeys with those of baboons? Dusky titis are smart about getting what they want, say, about the nuances of maintaining a pair bond. Baboons are also quite smart, but about different things, like navigating dominance hierarchies. Since the two species want such different things, since they are motivated to apply their non-affective capacities for such different purposes, one wonders whether it is even meaningful to ask which is smarter” (McShea 2017: 7). Yasushi Kiyokawa and Michael Hennessy worry that “the variety of approaches [...] precludes any strict standardization of procedures. These factors and even the backgrounds of the researchers themselves will continue to promote ambiguity and differences of opinion” (Kiyokawa & Hennessy 2018). Writing of comparisons of pain states in different animals, Edgar Walters and Amanda Williams have noted “the difficulty in defining pain in a way that allows pain [...] to be recognized and compared across species, a task that is especially challenging for attempted comparisons of the conscious component of pain,” observing that “there is considerable uncertainty about which behavioural features, neural circuits, cell types and molecules to compare across taxa” (Walters & Williams 2019: 6). In light of these difficulties, Lesley Rogers and Gisela Kaplan argue that “given our present state of knowledge of the needs and capabilities of classes of animals, let alone individual species, we feel, as biologists, that we first and foremost ought to guard against, or at least be very cautious about, the temptation of creating a scale of lesser or greater value of one species over another” (Rogers & Kaplan 2004: 196).

Even the best case looks un­promis­ing. Sup­pose we adopted the view that neu­ron count is a good proxy for moral sta­tus and ca­pac­ity for welfare. At first blush, that seems like a rel­a­tively easy fea­ture to mea­sure and com­pare. But, as it hap­pens, neu­rons aren’t all cre­ated equal be­cause not all ar­eas of the brain are equally im­por­tant. Brain re­gions that, say, merely in­ner­vate mus­cles are plau­si­bly less im­por­tant to moral sta­tus and ca­pac­ity for welfare than brain re­gions that, say, gov­ern emo­tional re­sponses.[39] Thus, across species, the num­ber of neu­rons in cer­tain brain re­gions may be more in­for­ma­tive than over­all neu­ron count. Two an­i­mals with the same over­all num­ber of neu­rons might differ in morally salient ways if those neu­rons are dis­tributed differ­ently across brain re­gions. But even when com­par­ing the same com­pa­rably-sized brain re­gions across species, var­i­ous cy­toar­chi­tec­tural differ­ences, such as the ex­tent of cor­ti­cal fold­ing, in­terneu­ronal dis­tance, ax­onal con­duc­tion ve­loc­ity, de­gree of myeli­na­tion, and synap­tic trans­mis­sion speed, could plau­si­bly be more im­por­tant still. To prop­erly com­pare neu­rons, we need to know where they are lo­cated and how they are con­nected to each other. In many ways, the fore­go­ing is an ar­gu­ment against tak­ing neu­ron count to be a good proxy of moral sta­tus or ca­pac­ity for welfare. But I hope it also serves to illus­trate the gen­eral challenge of com­par­ing even rel­a­tively sim­ple phys­iolog­i­cal fea­tures, to say noth­ing of more com­plex, amor­phous fea­tures such as gen­eral in­tel­li­gence or af­fec­tive so­phis­ti­ca­tion.

All told, these wor­ries sug­gest that com­par­ing morally rele­vant fea­tures across phy­lo­ge­net­i­cally dis­tant an­i­mals will be fraught with the­o­ret­i­cal and prac­ti­cal challenges. It will not always be clear which re­sults are truly com­pa­rable, and as such, many difficult judg­ment calls will be re­quired. It’s pos­si­ble that these sub­jec­tive judg­ment calls will be so nu­mer­ous and so in­escapable that any pre­tense of ob­jec­tivity will be lost. In that case, the best method for im­prov­ing our abil­ity to mea­sure ca­pac­ity for welfare and moral sta­tus might be to fund more rigor­ous work at var­i­ous com­par­a­tive cog­ni­tion labs so that bet­ter pro­ce­du­ral stan­dards for com­par­i­son can be de­vel­oped.

Weight­ing the features

After can­vass­ing the philo­soph­i­cal liter­a­ture to find char­ac­ter­is­tics that con­tribute to moral sta­tus and ca­pac­ity for welfare and can­vass­ing the sci­en­tific liter­a­ture to find mea­surable prox­ies for those char­ac­ter­is­tics, we will have a gen­eral list of fea­tures to in­ves­ti­gate.[40] How­ever, there is good cause to an­ti­ci­pate that not all of those fea­tures will be equally im­por­tant. There are at least three rea­sons to think that the fea­tures will need to be weighted.

The first rea­son is that within a given the­ory of moral sta­tus or ca­pac­ity for welfare, some fea­tures are more sig­nifi­cant than oth­ers. This differ­ence could be due to value plu­ral­ism. Ob­jec­tive list the­o­ries of welfare in­clude among the list of in­trin­sic goods items such as hap­piness, virtue, wis­dom, friend­ship, knowl­edge, and love. Th­ese items need not con­tribute to welfare equally. Even for a value monist view like he­do­nism, differ­ent fea­tures ought to be as­signed differ­ent weights. There is no sin­gle proxy that perfectly cap­tures ca­pac­ity for pain and plea­sure. A he­do­nist might think that self-aware­ness, lin­guis­tic so­phis­ti­ca­tion, af­fec­tive com­plex­ity, so­cia­bil­ity, and long-term plan­ning all in­fluence the range and in­ten­sity of pos­si­ble plea­sures and pains a crea­ture can ex­pe­rience. But it would be quite sur­pris­ing if such di­verse fea­tures con­tributed equally.

The sec­ond rea­son is that some fea­tures will be rele­vant to more the­o­ries than oth­ers, and in a rel­a­tively the­ory-neu­tral frame­work, the fea­tures ought to be weighted ac­cord­ing to how many the­o­ries they are rele­vant to. Since the goal is to de­velop in­ter­ven­tions that are ro­bust in the face of our moral un­cer­tainty, we should prob­a­bly pay more at­ten­tion to fea­tures that are salient across a spec­trum of the­o­ries. For in­stance, al­though he­do­nism holds that pains and plea­sures are the only things that mat­ter for welfare, vir­tu­ally all plau­si­ble the­o­ries of welfare hold that pains and plea­sures are rele­vant to welfare, ei­ther di­rectly or in­di­rectly. Thus, per­haps, ex­pe­ri­en­tial fea­tures de­serve more weight in the frame­work than, say, agen­tial fea­tures[41] that may be rele­vant to fewer the­o­ries.

For simplicity I’m here assuming perfect theory-neutrality among a small number of theories. In reality, although we might want to consider multiple theories, we might lean toward some theories more than others. In that case, each feature would need to be weighted not only by the number of theories to which the feature is relevant but also by the plausibility of the theories to which the feature is relevant. And even if a feature is relevant to multiple theories, it may not be equally important to each theory. So the features would need to be weighted not only according to how many theories they are relevant to, but also by how important they are to the theories for which they are relevant.
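One simple way to combine these considerations is to weight each feature by the sum, across theories, of one’s credence in the theory times the feature’s importance within that theory. The sketch below is a minimal illustration under that assumption. The theories listed, the credences, the importance scores, and the helper name `feature_weights` are placeholders, not considered judgments.

```python
# Hypothetical credences in three theories of welfare (placeholders that sum to 1).
credence = {"hedonism": 0.5, "desire_fulfillment": 0.3, "objective_list": 0.2}

# Hypothetical importance of each feature within each theory (0 means irrelevant).
importance = {
    "hedonism": {"valenced_experience": 1.0, "autonomy": 0.0},
    "desire_fulfillment": {"valenced_experience": 0.4, "autonomy": 0.6},
    "objective_list": {"valenced_experience": 0.5, "autonomy": 0.5},
}


def feature_weights(credence, importance):
    """Sum credence * importance over theories, then normalize to sum to 1."""
    weights = {}
    for theory, probability in credence.items():
        for feature, value in importance[theory].items():
            weights[feature] = weights.get(feature, 0.0) + probability * value
    total = sum(weights.values())
    return {feature: weight / total for feature, weight in weights.items()}


weights = feature_weights(credence, importance)
print(weights)
```

On these invented inputs the experiential feature ends up with more weight than the agential one, both because it matters to every theory in the set and because the theory that cares only about it has the highest credence.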

The fi­nal rea­son is that we may want to op­er­a­tional­ize char­ac­ter­is­tics in more than one way. For in­stance, we may want to in­clude a col­lec­tion of phys­iolog­i­cal fea­tures[42] in the database. Phys­iolog­i­cal fea­tures plau­si­bly aren’t in­trin­si­cally valuable; they’re merely prox­ies for other cat­e­gories. Sup­pose, for in­stance, we thought gen­eral in­tel­li­gence (what­ever that means ex­actly) is pretty im­por­tant for moral sta­tus or ca­pac­ity for welfare. We might want to es­ti­mate gen­eral in­tel­li­gence through a com­bi­na­tion of neu­rolog­i­cal fea­tures (e.g., en­cephal­iza­tion quo­tient, cor­ti­cal neu­ron count) and be­hav­ioral fea­tures (e.g., tool use, un­cer­tainty mon­i­tor­ing). Since no proxy will be perfect and it will be un­clear which proxy is best, we will prob­a­bly want to op­er­a­tional­ize most char­ac­ter­is­tics in mul­ti­ple ways. This needs to be re­flected in the weight­ing sys­tem. If some char­ac­ter­is­tics are op­er­a­tional­ized in more ways than oth­ers, then weight­ing fea­tures equally would amount to dou­ble-count­ing some char­ac­ter­is­tics.
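The double-counting worry can be made concrete with a small sketch. It assumes, purely for illustration, that proxies are averaged within each characteristic before the characteristics themselves are weighted. The scores, the `group_living` proxy, the equal characteristic weights, and both helper functions are hypothetical.

```python
from statistics import mean

# Hypothetical proxy scores for one taxon, grouped by the characteristic they stand in for.
proxies = {
    "general_intelligence": {
        "encephalization_quotient": 0.6,
        "cortical_neuron_count": 0.4,
        "tool_use": 0.8,
    },
    "sociability": {"group_living": 0.5},
}
# Placeholder weights: the two characteristics are assumed equally important.
characteristic_weights = {"general_intelligence": 0.5, "sociability": 0.5}


def grouped_score(proxies, characteristic_weights):
    """Average proxies within each characteristic, then weight the characteristics."""
    return sum(
        characteristic_weights[name] * mean(values.values())
        for name, values in proxies.items()
    )


def flat_score(proxies):
    """Naive alternative: every proxy counts equally, whatever it is a proxy for."""
    return mean(v for values in proxies.values() for v in values.values())


grouped = grouped_score(proxies, characteristic_weights)
flat = flat_score(proxies)
print(grouped, flat)
```

In the flat version, general intelligence silently receives three quarters of the total weight simply because it was operationalized three ways. The grouped version keeps the two characteristics at their intended equal weights.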

Weighting the features looks important and inevitable. But doing so will be incredibly tough. Jean Kazez explains the difficulty this way: “There are many capacities to which we assign positive value, but we don’t always have a definite idea of their relative values. If we’re trying to rank bower birds, crows, and wolves, it depends what’s more valuable, artistic ability (which favors the bower bird) or sheer intelligence (which favors the crow) or sociability (which favors the wolf). We’re not going to be able to put these three species on separate rungs of a ladder, in any particular order, and neither is the situation quite as crisp as a straightforward tie. We just don’t know how to assign them a place on the ladder, relative to each other” (Kazez 2010: 87-88).

The trou­ble gets even worse when we con­sider so-called com­bi­na­tion effects: “A prop­erty might raise the moral sta­tus of one be­ing but not an­other, be­cause it might raise moral sta­tus only when com­bined with cer­tain other prop­er­ties” (Har­man 2003: 177-178). For ex­am­ple, it might be the case that a cer­tain de­gree of au­ton­omy is re­quired be­fore some proso­cial ca­pac­i­ties con­tribute to moral sta­tus. Maybe nur­tur­ing be­hav­ior that is en­tirely pre-pro­grammed and in­stinc­tive counts for less than love freely given. Honey bees and cows both care for their young, but if we think cows have a greater ca­pac­ity for ra­tio­nal choice than honey bees, then the same level of ju­ve­nile guardian­ship might raise the moral sta­tus of cows more than honey bees.[43] Thus, the fea­ture weights may not be static or in­de­pen­dent. In­stead, they might be dy­namic and in­ter­de­pen­dent. That is, the weight of a given fea­ture may de­pend on the pres­ence or ab­sence of other fea­tures. Ac­count­ing for this com­plex­ity ap­pears stag­ger­ingly hard.
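One very simple way to model such a combination effect is to let one feature scale the contribution of another. The sketch below assumes a multiplicative interaction between autonomy and juvenile guardianship. That functional form is chosen only because it is easy to read, and the scores for cows and honey bees are invented placeholders, not empirical estimates.

```python
def guardianship_contribution(guardianship, autonomy, base_weight=1.0):
    """Contribution of juvenile guardianship, scaled by the animal's autonomy score.

    The multiplicative form is an assumption made for illustration; the true
    interaction, if there is one, could take many other shapes.
    """
    return base_weight * autonomy * guardianship


# Same hypothetical guardianship score, different hypothetical autonomy scores.
cow = guardianship_contribution(guardianship=0.8, autonomy=0.6)
honey_bee = guardianship_contribution(guardianship=0.8, autonomy=0.1)
print(cow, honey_bee)
```

Here the effective weight on guardianship is no longer a fixed number: it is `base_weight * autonomy`, so it varies from animal to animal. With many features and many possible interactions, the number of such terms to specify and justify grows quickly.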

In sum, ac­cu­rate point es­ti­mates of the rel­a­tive weights of the fea­tures are prob­a­bly un­achiev­able. The in­tel­lec­tu­ally hon­est thing to do is to as­sign each fea­ture a range of weights.[44] But these ranges might be so wide as to rob the pro­ject of any ac­tion-guid­ing con­clu­sions. That is, one’s views about the com­par­a­tive moral value of differ­ent an­i­mals might de­pend al­most wholly on how one in­ter­prets the rel­a­tive im­por­tance of differ­ent fea­tures. If there’s noth­ing to be done to jus­tifi­ably nar­row the range of plau­si­ble weights, then the pro­ject may well end up coun­te­nanc­ing a huge range of trade­offs among species. If the pro­ject doesn’t tell us any­thing prac­ti­cal about which trade­offs are per­mis­si­ble, then it is al­most use­less.
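To see how wide weight ranges can wash out a conclusion, consider the following sketch. It assumes a simple weighted-sum model and checks the ratio of two taxa’s scores at every corner of the weight ranges. For a ratio of weighted sums with positive scores, the extremes over a box of weights fall at the corners, so checking the corners suffices. The taxa, scores, ranges, and the helper `ratio_range` are all hypothetical.

```python
from itertools import product

# Hypothetical feature scores for two taxa on a 0-1 scale.
scores = {
    "taxon_a": {"feature_x": 0.9, "feature_y": 0.2},
    "taxon_b": {"feature_x": 0.4, "feature_y": 0.8},
}
# Hypothetical (low, high) range of plausible weights for each feature.
weight_ranges = {"feature_x": (0.2, 0.8), "feature_y": (0.2, 0.8)}


def ratio_range(scores, weight_ranges, a, b):
    """Smallest and largest ratio of a's weighted score to b's over the weight corners."""
    names = list(weight_ranges)
    ratios = []
    for corner in product(*(weight_ranges[name] for name in names)):
        weights = dict(zip(names, corner))
        total_a = sum(weights[name] * scores[a][name] for name in names)
        total_b = sum(weights[name] * scores[b][name] for name in names)
        ratios.append(total_a / total_b)
    return min(ratios), max(ratios)


low, high = ratio_range(scores, weight_ranges, "taxon_a", "taxon_b")
print(low, high)
```

With these placeholder numbers the ratio runs from below 1 to above 1, so even the ordering of the two taxa depends on which weights one adopts. Narrowing the weight ranges is the only way, in this toy model, to recover an action-guiding comparison.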


One can­not work in effec­tive an­i­mal ad­vo­cacy and wholly ig­nore the ques­tion of com­par­a­tive moral value. Re­sources are finite, so trade­offs among differ­ent an­i­mals are in­evitable. Every time an or­ga­ni­za­tion launches a cam­paign, a re­searcher in­ves­ti­gates a ques­tion, or a grant­maker funds a pro­ject, trade­offs are made. The time, money, and at­ten­tion that are de­voted to one species could also have been de­voted to an­other.

Although prac­ti­cal con­cerns will always guide our de­ci­sion-mak­ing to some ex­tent, it’s im­por­tant to think about how we would ideally like to dis­tribute re­sources. If the ideal dis­tri­bu­tion of re­sources differs sig­nifi­cantly from what is cur­rently fea­si­ble, we ought to de­vote time and money to sur­mount­ing those ob­sta­cles. It is not an im­mutable fact that, say, fish elicit lit­tle sym­pa­thy from the gen­eral pub­lic or that fish welfare or­ga­ni­za­tions are few in num­ber. If fish welfare is ne­glected rel­a­tive to its im­por­tance, we can use our re­sources to im­prove tractabil­ity and over­come limit­ing fac­tors.

To un­der­stand the ideal dis­tri­bu­tion of re­sources, we must un­der­stand the com­par­a­tive value of differ­ent types of an­i­mals. If an­i­mal ex­pe­riences and in­ter­ests are all equal, then we should aim for in­ter­ven­tions that have the great­est im­pact for the great­est num­ber. Such a view would al­most cer­tainly push us to care more for small, nu­mer­ous in­ver­te­brates than we cur­rently do.

How­ever, if some an­i­mals lack moral stand­ing, then we should ex­clude those an­i­mals from our moral con­sid­er­a­tion. If that’s the case, it’s im­por­tant to know where to draw the line be­tween an­i­mals that have moral stand­ing and those that do not. It’s also im­por­tant to know the con­se­quences of draw­ing the line in one place rather than an­other, how con­fi­dent we are in draw­ing the line, and what in­for­ma­tion would cause us to change where we draw the line.

If some animals have moral standing and others lack it, then by definition animals differ in their moral status and capacity for welfare. But there are plausible reasons to think that among the animals that have moral standing, there will be further variation in moral status or capacity for welfare. If that’s the case, then saving the lives of one type of animal will not always have the same intrinsic moral value as saving the lives of other types of animals.[45] The suffering of, say, chimpanzees and octopuses may count for more, morally, than the suffering of, say, mealworms and prawns.

To de­ter­mine the ideal al­lo­ca­tion of re­sources, we need some way to mea­sure ca­pac­ity for welfare and moral sta­tus. Do­ing so will not be easy. The philo­soph­i­cal ter­rain is treach­er­ous. Our in­tu­itions are im­pre­cise and likely skewed. The sci­en­tific liter­a­ture is vast but un­cer­tain. Still, even a small re­duc­tion in our un­cer­tainty could make a big differ­ence to our al­loca­tive de­ci­sion-mak­ing. There are many years ahead to re­fine such es­ti­mates. But we can’t put off mak­ing trade­offs now.


This essay is a project of Rethink Priorities. It was written by Jason Schukraft. Thanks to Kim Cuddington, Marcus A. Davis, Neil Dullaghan, David Moss, and Jeff Sebo for helpful feedback.

