# AidanGoth

Karma: 96
• Thanks for the clarification—I see your concern more clearly now. You’re right, my model does assume that all balls were coloured using the same procedure, in some sense—I’m assuming they’re independently and identically distributed.

Your case is another reasonable way to apply the maximum entropy principle and I think it points to another problem with the principle, but I’d frame it slightly differently. I don’t think that the maximum entropy principle is actually directly problematic in the case you describe. If we assume that all balls are coloured by completely different procedures (i.e. so that the colour of one ball doesn’t tell us anything about the colours of the other balls), then seeing 99 red balls doesn’t tell us anything about the final ball. In that case, I think it’s reasonable (even required!) to have a 50% credence that it’s red and unreasonable to have a 99% credence, if your prior was 50%. If you find that result counterintuitive, then I think that’s more of a challenge to the assumption that the balls are all coloured in such a way that learning the colour of some doesn’t tell you anything about the colour of the others than a challenge to the maximum entropy principle. (I appreciate you want to assume nothing about the colouring processes, rather than making the assumption that the balls are all coloured in such a way that learning the colour of some doesn’t tell you anything about the colour of the others, but in setting up your model this way, I think you’re assuming that implicitly.)

Perhaps another way to see this: if you don’t follow the maximum entropy principle and instead have a prior of 30% that the final ball is red and then draw 99 red balls, in your scenario, you should maintain 30% credence (if you don’t, then you’ve assumed something about the colouring process that makes the balls not independent). If you find that counterintuitive, then the issue is with the assumption that the balls are all coloured in such a way that learning the colour of some doesn’t tell you anything about the colour of the others, because we haven’t used the principle of maximum entropy in that case.

I think this actually points to a different problem with the maximum entropy principle in practice: we rarely come from a position of complete ignorance (or complete ignorance besides a given mean, variance etc.), so it’s actually rarely applicable. Following the principle sometimes gives counterintuitive/unreasonable results because we actually know a lot more than we realise, and we lose much of that information when we apply the maximum entropy principle.

• The maximum entropy principle does give implausible results if applied carelessly but the above reasoning seems very strange to me. The normal way to model this kind of scenario with the maximum entropy prior would be via Laplace’s Rule of Succession, as in Max’s comment below. We start with a prior for the probability that a randomly drawn ball is red and can then update on 99 red balls. This gives a 100/101 chance that the final ball is red (about 99%!). Or am I missing your point here?

Somewhat more formally, we’re looking at a Bernoulli trial: for each ball, there’s a probability p that it’s red. We start with the maximum entropy prior for p, which is the uniform distribution on the interval [0,1] (= beta(1,1)). We update on 99 red balls, which gives a posterior for p of beta(100,1), which has mean 100/101 (this is a standard result; see e.g. conjugate priors: the beta distribution is a conjugate prior for a Bernoulli likelihood).
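As a quick sanity check on that arithmetic, here's a minimal sketch of the beta-Bernoulli update (the function name and exact-fraction framing are mine, not from the original comment):

```python
# Sketch: Laplace's Rule of Succession as a beta-Bernoulli conjugate update.
from fractions import Fraction

def posterior_mean_red(n_red, n_draws, alpha=1, beta=1):
    """Posterior mean of p after seeing n_red red balls in n_draws,
    starting from a beta(alpha, beta) prior; beta(1,1) is the uniform
    (maximum entropy) prior on [0,1]."""
    return Fraction(alpha + n_red, alpha + beta + n_draws)

# 99 red balls in 99 draws under the uniform prior:
print(posterior_mean_red(99, 99))  # 100/101, about 0.99
```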

The more common objection to the maximum entropy principle comes when we try to reparametrise. A nice but simple example is van Fraassen’s cube factory: a factory manufactures cubes up to 2x2x2 feet; what’s the probability that a randomly selected cube has side length less than 1 foot? If we apply the maximum entropy principle (MEP), we say 1/2, because each cube has length between 0 and 2 and MEP implies that each length is equally likely. But we could have equivalently asked: what’s the probability that a randomly selected cube has face area less than 1 square foot? Face area ranges from 0 to 4, so MEP implies a probability of 1/4. All and only those cubes with side length less than 1 have face area less than 1, so these are precisely the same events, but MEP gave us different answers for their probabilities! We could do the same in terms of volume and get a different answer again. This inconsistency is the kind of implausible result most commonly pointed to.
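A small Monte Carlo sketch of the inconsistency (my own illustration; the event is the same, only the parametrisation changes):

```python
# The cube-factory problem: applying the maximum entropy (uniform) prior
# to side length vs face area gives different probabilities for the SAME event.
import random

random.seed(0)
N = 100_000

# MEP on side length: length uniform on (0, 2); event "length < 1"
p_length = sum(random.uniform(0, 2) < 1 for _ in range(N)) / N

# MEP on face area: area uniform on (0, 4); event "area < 1",
# which holds exactly when the side length is less than 1.
p_area = sum(random.uniform(0, 4) < 1 for _ in range(N)) / N

print(round(p_length, 2))  # about 0.5
print(round(p_area, 2))    # about 0.25
```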

• An important difference between overall budgets and job boards is that budgets tell you how all the resources are spent, whereas job boards just tell you how (some of) the resources are spent on the margin. EA could spend a lot of money on some area and/or employ lots of people to work in that area without actively hiring new people. We’d miss that by just looking at the job board.

I think this is a nice suggestion for getting a rough idea of EA priorities but, because of this and Habryka’s observation that the 80k job board is not representative of new jobs in and around EA, I’d caution against putting much weight on it.

• The latex isn’t displaying well (for me at least!), which makes this really hard to read. You just need to press ‘ctrl’/‘cmd’ and ‘4’ for inline latex and ‘ctrl’/‘cmd’ and ‘M’ for block :)

• I found the answers to this question on stats.stackexchange useful for thinking about and getting a rough overview of “uninformative” priors, though it’s a bit too technical to apply easily in practice. It’s aimed at formal Bayesian inference rather than more general forecasting.

In information theory, entropy is a measure of (lack of) information: high entropy distributions have low information. That’s why the principle of maximum entropy, as Max suggested, can be useful.

Another meta answer is to use the Jeffreys prior. This has the property that it is invariant under a change of coordinates. This isn’t the case for maximum entropy priors in general and is a source of inconsistency (see e.g. the partition problem for the principle of indifference, which is just a special case of the principle of maximum entropy). Jeffreys priors are often unwieldy, but one important exception is for a parameter on the interval [0,1] (e.g. a probability), for which the Jeffreys prior is the beta(1/2,1/2) distribution. See the red line in the graph at the top of the beta distribution Wikipedia page: the density is spread to the edges, close to 0 and 1.

This relates to Max’s comment about Laplace’s Rule of Succession: taking N_v = 2, M_v = 1 corresponds to the uniform distribution on [0,1] (which is just beta(1,1)). This is the maximum entropy distribution on [0,1]. But as Max mentioned, we can vary N_v and M_v. Using the Jeffreys prior would be like setting N_v = 1 and M_v = 1/2, which doesn’t have as nice an interpretation (1/2 a success?) but has nice theoretical features. It’s especially useful if you want to put the density around 0 and 1 but still have mean 1/2.
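To make the comparison concrete, here's a small sketch (my own illustration, not from the original comment) of how the Laplace and Jeffreys priors behave on the 99-red-balls example:

```python
# Posterior mean after s successes in n trials under a beta(a, b) prior:
# mean = (a + s) / (a + b + n). Here N_v corresponds to a + b and M_v to a.
def posterior_mean(s, n, a, b):
    return (a + s) / (a + b + n)

s, n = 99, 99
laplace = posterior_mean(s, n, 1, 1)       # beta(1,1): N_v = 2, M_v = 1
jeffreys = posterior_mean(s, n, 0.5, 0.5)  # beta(1/2,1/2): N_v = 1, M_v = 1/2

print(laplace)   # 100/101, about 0.990
print(jeffreys)  # 99.5/100 = 0.995, slightly more extreme
```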

There’s a bit more discussion of Laplace’s Rule of Succession and the Jeffreys prior in an EA context in Toby Ord’s comment in response to Will MacAskill’s Are we living at the most influential time in history?

Finally, a bit of a cop-out, but I think worth mentioning, is the suggestion of imprecise credences in one of the answers to the stats.stackexchange question linked above. By selecting a range of priors and seeing how much the resulting posteriors converge, you might find that the choice of prior doesn’t matter very much; and when it does matter, I expect this could be useful for identifying your largest uncertainties.

• Reflecting on this example and your x-risk questions, this highlights the fact that in the beta(0.1,0.1) case, we’re either very likely fine or really screwed, whereas the beta(20,20) case is similar to a fair coin toss. So it feels easier to me to get motivated to work on mitigating the second one. I don’t think that says much about which is higher priority to work on though, because reducing the risk in the first case could be super valuable. The value of information from narrowing uncertainty in the first case seems much higher though.

• Nice post! Here’s an illustrative example in which the distribution of p matters for expected utility.

Say you and your friend are deciding whether to meet up but there’s a risk that you have a nasty, transmissible disease. For each of you, there’s the same probability p that you have the disease. Assume that whether you have the disease is independent of whether your friend has it. You’re not sure if p has a beta(0.1,0.1) distribution or a beta(20,20) distribution, but you know that the expected value of p is 0.5.

If you meet up, you get +1 utility. If you meet up and one of you has the disease, you’ll transmit it to the other person, and you get −3 utility. (If you both have the disease, then there’s no counterfactual transmission, so meeting up is just worth +1.) If you don’t meet up, you get 0 utility.

It makes a difference which distribution p has. Here’s an intuitive explanation. In the first case, it’s really unlikely that one of you has it but not the other. Most likely, either (i) you both have it, so meeting up will do no additional harm, or (ii) neither of you has it, so meeting up is harmless. In the second case, it’s relatively likely that one of you has the disease but not the other, so you’re more likely to end up with the bad outcome.

If you crunch the numbers, you can see that it’s worth meeting up in the first case, but not in the second. For this to be true, we have to assume conditional independence: that you and your friend having the disease are independent events, conditional on the probability of an arbitrary person having the disease being p. It doesn’t work if we assume unconditional independence, but I think conditional independence makes more sense.

The calculation is a bit long-winded to write up here, but I’m happy to if anyone is interested in seeing/checking it. The gist is to write the probability of a state obtaining as the integral with respect to p of the probability of that state obtaining conditional on p, multiplied by the pdf of p, i.e. P(state) = ∫ P(state|p) f(p) dp. Separate the states via conditional independence (i.e. P(you have it and your friend has it|p) = P(you have it|p) × P(your friend has it|p)), plug in values (e.g. P(you have it|p) = p) and integrate. For example, the probability that you both have it is ∫ p² f(p) dp = E[p²], which can be computed for the beta(0.1,0.1) distribution. Then calculate the expected utility of meeting up as normal, with the utilities above and the probabilities calculated in this way. If I haven’t messed up, you should find that the expected utility is positive in the beta(0.1,0.1) case (i.e. better to meet up) and negative in the beta(20,20) case (i.e. better not to meet up).
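For anyone who wants to check the claim without doing the integrals by hand, here's a sketch using the standard beta moment E[p(1−p)] = ab/((a+b)(a+b+1)); the utilities are the ones from the example above:

```python
# Expected utility of meeting up when p ~ beta(a, b), with utilities:
# meet with no counterfactual transmission -> +1, meet and exactly one
# of you has the disease -> -3, don't meet -> 0.
def expected_utility(a, b):
    # E[p(1-p)] for a beta(a, b) distribution (standard result)
    e_you_not_friend = a * b / ((a + b) * (a + b + 1))
    # by symmetry, "you only" and "friend only" are equally likely
    p_exactly_one = 2 * e_you_not_friend
    return p_exactly_one * (-3) + (1 - p_exactly_one) * 1

print(expected_utility(0.1, 0.1))  # positive (2/3): better to meet up
print(expected_utility(20, 20))    # negative (-39/41): better not to
```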

• Thanks, this is a good criticism. I think I agree with the main thrust of your comment, but in a bit of a roundabout way.

I agree that focusing on expected value is important and that ideally we should communicate how arguments and results affect expected values. I think it’s helpful to distinguish between (1) expected value estimates that our models output and (2) the overall expected value of an action/intervention, which is informed by our models and arguments etc. The Guesstimate model is so speculative that it doesn’t actually do that much work in my overall expected value, so I don’t want to overemphasise it. Perhaps we under-emphasised it though.

The non-probabilistic model is also speculative, of course, but I think it offers stronger evidence about the relative cost-effectiveness than the output of the Guesstimate model. It doesn’t offer a precise number in the same way that the Guesstimate model does, but the Guesstimate model only does that by making arbitrary distributional assumptions, so I don’t think it adds much information. I think that the non-probabilistic model offers evidence of greater cost-effectiveness of THL relative to AMF (given hedonism and anti-speciesism) because THL tends to come out better and sometimes comes out much, much better. I also think this isn’t super strong evidence, but you’re right that our summary is overly agnostic in light of this.

In case it’s helpful, here’s a possible explanation for why we communicated the findings in this way. We actually came into this project expecting THL to be much more cost-effective, given a wide range of assumptions about the parameters of our model (and assuming hedonism and anti-speciesism), and we were surprised to see that AMF could plausibly be more cost-effective. So for me, this project gave an update slightly in favour of AMF in terms of expected cost-effectiveness (though I was probably previously overconfident in THL). For many priors, this project should update the other way, and for even more priors, this project should leave you expecting THL to be more cost-effective. I expect we were a bit torn in communicating how we updated and what the project showed, and didn’t have the time to think this through and write it down explicitly, given other projects competing for our time and energy. It’s been helpful to clarify a few things through this discussion though :)

• Thanks for raising this. It’s a fair question but I think I disagree that the numbers you quote should be in the top-level summary.

I’m wary of overemphasising precise numbers. We’re really uncertain about many parts of this question and we arrived at these numbers by making many strong assumptions, so these numbers don’t represent our all-things-considered view and it might be misleading to state them without a lot of context. In particular, the numbers you quote came from the Guesstimate model, which isn’t where the bulk of the work on this project was focused (though we could have acknowledged that more). To my mind, the upshot of this investigation is better described by this bullet in the summary than by the numbers you quote:

• In this model, in most of the most plausible scenarios, THL appears better than AMF. The difference in cost-effectiveness is usually within 1 or 2 orders of magnitude. Under some sets of reasonable assumptions, AMF looks better than THL. Because we have so much uncertainty, one could reasonably believe that AMF is more cost-effective than THL or one could reasonably believe that THL is more cost-effective than AMF.

• Thanks for this. I think this stems from the same issue as your nitpick about AMF bringing about outcomes as good as saving the lives of children under 5. The Founders Pledge Animal Welfare Report estimates that THL historically brought about outcomes as good as moving 10 hen-years from battery cages to aviaries per dollar, so we took this as our starting point and that’s why this is framed in terms of moving hens from battery cages to aviaries. We should have been clearer about this though, to avoid suggesting that the only outcomes of THL are shifts from battery cages to aviaries.

• Thanks for this comment, you raise a number of important points. I agree with everything you’ve written about QALYs and DALYs. We decided to frame this in terms of DALYs for simplicity and familiarity. This was probably just a bit confusing though, especially as we wanted to consider values of well-being (much) less than 0 and, in principle, greater than 1. So maybe a generic unit of hedonistic well-being would have been better. I think you’re right that this doesn’t matter a huge amount because we’re uncertain over many orders of magnitude for other variables, such as the moral weight of chickens.

The trade-off problem is really tricky. I share your scepticism about people’s actual preferences tracking hedonistic value. We just took it for granted that there is a single, privileged way to make such trade-offs but I agree that it’s far from obvious that this is true. I had in mind something like “a given experience has well-being −1 if an idealised agent/an agent with the experiencer’s idealised preferences would be indifferent between non-existence and a life consisting of that experience as well as an experience of well-being 1”. There are a number of problems with this conception, including the issue that there might not be a single idealised set of preferences for these trade-offs, as you suggest. I think we needed to make some kind of assumption like this to get this project off the ground but I’d be really interested to hear thoughts/see future discussion on this topic!

# How good is The Humane League compared to the Against Malaria Foundation?

29 Apr 2020 13:40 UTC
62 points
• Yes, feeling much better now fortunately! Thanks for these thoughts and studies, Derek.

Given our time constraints, we did make some judgements relatively quickly but in a way that seemed reasonable for the purposes of deciding whether to recommend AfH. So this can certainly be improved and I expect your suggestions to be helpful in doing so. This conversation has also made me think it would be good to explore six-monthly/quarterly/monthly retention rates rather than annual ones—thanks for that. :)

Our retention rates for StrongMinds were also based partly on this study, but I wasn’t involved in that analysis so I’m not sure of the details of the retention rates there.

• Yes, we had physical health problems in mind here. I appreciate this isn’t clear though—thanks for pointing it out. Indeed, we are aware of the underestimation of the badness of mental health problems and aim to take this into account in future research in the subjective well-being space.

• Thanks very much for this thoughtful comment and for taking the time to read and provide feedback on the report. Sorry about the delay in replying—I was ill for most of last week.

1. Yes, you’re absolutely right. The current bounds are very wide and they represent extreme, unlikely scenarios. We’re keen to develop probabilistic models in future cost-effectiveness analyses to produce e.g. 90% confidence intervals and carry out sensitivity analyses, probably using Guesstimate or R. We didn’t have time to do so for this project but this is high on our list of methodological improvements.

2. Estimating the retention rates is challenging so it’s helpful for us to know that you think our values are too high. We based this primarily on our retention rate for StrongMinds, but adjusted downwards. It’s possible we anchored on this too much. However, it’s not clear to me that our values are too high. In particular, if our best-guess retention rate for AfH is too high, then this is probably also true for StrongMinds. Since we’re using StrongMinds as a benchmark, this might not change our conclusions very much.

The total benefits are calculated somewhat confusingly and I appreciate you haven’t had the chance to look at the CEA in detail. If d is the effect directly post-treatment and r is the retention rate, we calculated the total benefits as

d/2 + d·r + d·r² + … = d/2 + d·r/(1 − r)

That is, we assume half a year of full effect, and then discount each year that follows by r each time. We calculated it in this way because for StrongMinds, we had 6-month follow-up data. However, it’s not clear that this approach is best in this case. It might have been better to:

• Assume 0.15 years at full effect

• Since the study has only an 8-week follow-up, as you mention

• Assume somewhere in between 0.15 and 0.5 years at full effect

• Since the effects still looked very good at 8-week follow-up (albeit with no control), and evidence from interventions such as StrongMinds that suggests longer-lasting effects still seems somewhat relevant
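The discounting scheme described above can be sketched as follows (the effect size and retention rate here are hypothetical placeholders, not figures from the CEA):

```python
# Total benefit with h years at full effect d, then a geometric tail in
# which each subsequent year retains a fraction r of the effect:
# total = d*h + d*r + d*r^2 + ... = d*h + d*r/(1 - r)
def total_benefit(d, r, h):
    return d * h + d * r / (1 - r)

d, r = 1.0, 0.5  # hypothetical effect size and retention rate
print(total_benefit(d, r, 0.5))   # half a year at full effect -> 1.5
print(total_benefit(d, r, 0.15))  # 0.15 years at full effect -> 1.15
```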

Finally, I think there are good reasons to prefer AfH over CBT in high-income countries, even if our CEA suggests they are similarly cost-effective in terms of depression. (Though these reasons might not be strong enough to convince you that AfH and e.g. StrongMinds are similarly cost-effective.)

• AfH aims to improve well-being broadly, not just by treating mental health problems.

• Although much—perhaps most—of the benefits of AfH’s courses come from reduction in depression, some of the benefits to e.g. happiness, life satisfaction and pro-social behaviour aren’t captured by measuring depression

• Our CEA is very conservative in some respects

• The effect sizes we used (after our Bayesian analysis) are about 30% as large as reported in the study

• If CBT effects aren’t held to similar levels of scrutiny, then we can’t compare cost-effectiveness fairly

• We think that the wider benefits of AfH’s scale-up could be very large

• We focused just on the scale-up of the Exploring What Matters courses because this is easiest to measure

• The happiness movement that AfH is leading and growing could be very beneficial, e.g. widely sharing materials on AfH’s website, bringing (relatively small) benefits to a large number of people

That said, I think it’s worth reconsidering our retention rates when we review this funding opportunity. Thanks for your input.

3. This is correct. We did not account for the opportunity cost of facilitators’ or participants’ time. As always, there are many factors and, given time constraints, we couldn’t account for all of them. We thought that these costs would be small compared to the benefits of the course so we didn’t prioritise their inclusion. I don’t think we explicitly mentioned the opportunity cost of time in the report though, so thanks for pointing this out.

# Founders Pledge Charity Recommendation: Action for Happiness

5 Mar 2020 11:27 UTC
35 points
(founderspledge.com)
• Scott Aaronson and Giulio Tononi (the main advocate of IIT) and others had an interesting exchange on IIT which goes into the details more than Muehlhauser’s report does. (Some of it is cited and discussed in the footnotes of Muehlhauser’s report, so you may well be aware of it already.) Here, here and here.

I do have some reservations about (variance) normalisation, but it seems like a reasonable approach to consider. I haven’t thought about this loads though, so this opinion is not super robust.

Just to tie it back to the original question, whether we prioritise x-risk or WAS will depend on the agents who exist, obviously. Because x-risk mitigation is plausibly much more valuable on totalism than WAS mitigation is on other plausible views, I think you need almost everyone to have very, very low (in my opinion, unjustifiably low) credence in totalism for your conclusion to go through. In the actual world, I think x-risk still wins. As I suggested before, it could be the case that the value of x-risk mitigation is not that high, or even negative due to s-risks (this might be your best line of argument for your conclusion), but this suggests prioritising large-scale s-risks. You rightly pointed out that a million years of WAS is the most concrete example of s-risk we currently have. It seems plausible that other and larger s-risks could arise in the future (e.g. large-scale sentient simulations), which, though admittedly speculative, could be really big in scale. I tend to think general foundational research aiming at improving the trajectory of the future is more valuable to do today than WAS mitigation. What I mean by ‘general foundational research’ is not entirely clear but, for instance, thinking about and clarifying that seems more important than WAS mitigation.

• I’m making a fresh comment to make some different points. I think our earlier thread has reached the limit of productive discussion.

I think your theory is best seen as a metanormative theory for aggregating both the well-being of existing agents and the moral preferences of existing agents. There are two distinct types of value that we should consider:

prudential value: how good a state of affairs is for an agent (e.g. their level of well-being, according to utilitarianism; their priority-weighted well-being, according to prioritarianism).

moral value: how good a state of affairs is, morally speaking (e.g. the sum of total well-being, according to totalism; or the sum of total priority-weighted well-being, according to prioritarianism).

The aim of a population axiology is to determine the moral value of a state of affairs in terms of the prudential value of the agents who exist in that state of affairs. Each agent can have a preference order on population axiologies, expressing their moral preferences.

We could see your theory as looking at the prudential value of all the agents in a state of affairs (their level of well-being) and their moral preferences (how good they think the state of affairs is compared to the other states of affairs in the choice set). The moral preferences, at least in part, determine the critical level (because you take into account moral intuitions, e.g. that the sadistic repugnant conclusion is very bad, when setting critical levels). So the critical level of an agent (on your view) expresses the moral preferences of that agent. You then aggregate the well-being and moral preferences of agents to determine overall moral value—you’re aggregating not just well-being, but also moral preferences, which is why I think this is best seen as a metanormative theory.

Because the critical level is used to express moral preferences (as opposed to purely discounting well-being), I think it’s misleading and the source of a lot of confusion to call this a critical level theory—it can incorporate critical level theories if agents have moral preferences for critical level theories—but the theory is, or should be, much more general. In particular, in determining the moral preferences of agents, one could (and, I think, should) take normative uncertainty into account, so that the ‘critical level’ of an agent represents their moral preferences after moral uncertainty. Aggregating these moral preferences means that your theory is actually a two-level metanormative theory: it can (and should) take standard normative uncertainty into account in determining the moral preferences of each agent, and then aggregate moral preferences across agents.

Hopefully, you agree with this characterisation of your view. I think there are now some things you need to say about determining the moral preferences of agents and how they should be aggregated. If I understand you correctly, each agent in a state of affairs looks at some choice set of states of affairs (states of affairs that could obtain in the future, given certain choices?) and comes up with a number representing how good or bad the state of affairs that they are in is. In particular, this number could be negative or positive. I think it’s best just to aggregate moral preferences directly, rather than pretending to use critical levels that we subtract from levels of well-being and then aggregating ‘relative utility’, but that’s not an important point.

I think the choice-set dependence of moral preferences is not ideal, but I imagine you’ll disagree with me here. In any case, I think a similar theory could be specified that doesn’t rely on this choice-set dependence, though I imagine it might be harder to avoid the conclusions you aim to avoid, given choice-set independence. I haven’t thought about this much.

You might want to think more about whether summing up moral preferences is the best way to aggregate them. This form of aggregation seems vulnerable to extreme preferences that could dominate lots of mild preferences. I haven’t thought much about this and don’t know of any literature on this directly, but I imagine voting theory is very relevant here. In particular, the theory I’ve described looks just like a score voting method. Perhaps you could place bounds on scores/moral preferences somehow to avoid the dominance of very strong preferences, but it’s not immediately clear to me how this could be done justifiably.

It’s worth noting that the resulting theory won’t avoid the sadistic repugnant conclusion unless every agent has very, very strong moral preferences to avoid it. But I think you’re OK with that. I get the impression that you’re willing to accept it in increasingly strong forms as the proportion of agents who are willing to accept it increases.