I’m a computational physicist, and I generally donate to global health. I am skeptical of AI x-risk and of big-R Rationalism, and I intend to explain why in great detail.
titotal
This seems to me like an attempt to run away from the premise of the thought experiment. I’m seeing lots of “maybes” and “mights” here, but we can just explain them away with more stipulations: you’ve only seen the outside of their ship, you’re both wearing spacesuits that you can’t see into, you’ve done studies and found that neuron count and moral reasoning skills are mostly uncorrelated and that spaceflight can be done with more or fewer neurons, etc.
None of these avert the main problem: the reasoning really is symmetrical, so both perspectives should be valid. The EV of saving the alien is 2N, where N is the human’s number of neurons, and the EV of saving the human, from the alien’s perspective, is 2P, where P is the alien’s number of neurons. There is no way to declare one perspective the winner over the other without knowing both N and P. Remember, in the original two-envelopes problem you knew both the units and the numerical value in your own envelope: that was not enough to avert the paradox.
See, the thing that’s confusing me here is that there are many solutions to the two-envelope problem, but none of them say “switching actually is good”. They are all about explaining why the EV reasoning is wrong and switching is actually bad. So in any EV problem which can be reduced to the two-envelope problem, you shouldn’t switch. I don’t think this is confined to alien-vs-human cases either: perhaps any situation where you are unsure about a conversion ratio might run into two-envelope-style problems, but I’ll have to think about it.
I think switching has to be wrong, for symmetry-based reasons.
Let’s imagine you and a friend fly out on a spaceship and run into an alien spaceship from another civilisation that seems roughly as advanced as yours. You and your buddy have just met the alien and their buddy, but haven’t learnt each other’s languages, when an accident occurs: your buddy and their buddy go flying off in different directions, and collectively you can only save one of them. The human is slightly closer, and a rescue attempt is slightly more likely to be successful as a result. Based solely on hedonic utilitarianism, do you save the alien instead?
We’ll make it even easier and say that moral worth is strictly proportional to the number of neurons in the brain, which is an actual, physical quantity.
I can imagine being an EA-style reasoner, and reasoning as follows: obviously I should anchor on the alien and the human having equal neuron counts, at level N. But obviously there’s a lot of uncertainty here. Let’s approximate a lognormal-style distribution and say there’s a 50% chance the alien is also at level N, a 25% chance they have N/10 neurons, and a 25% chance they have 10N neurons. So the expected number of neurons in the alien is 0.25*(N/10) + 0.5*N + 0.25*(10N) = 3.025N. Therefore, the alien is worth 3 times as much as a human in expectation, so we should obviously save it over the human.
Meanwhile, by pure happenstance, the alien is also a hedonic EA-style reasoner with the same assumptions, with neuron count P. They do the same calculation and reason that the human is worth 3.025P, so they should save the human.
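To make the symmetry concrete, here’s a minimal sketch of the calculation both parties are running (just the toy 25/50/25 distribution from above):

```python
# Expected neuron count of the *other* party, relative to your own, using the
# same rough distribution from either perspective: 25% at 0.1x, 50% at 1x, 25% at 10x.
probs = [0.25, 0.50, 0.25]
ratios = [0.1, 1.0, 10.0]

expected_ratio = sum(p * r for p, r in zip(probs, ratios))
print(expected_ratio)  # 3.025

# The human concludes the alien is worth 3.025 * N (N = human neuron count).
# The alien runs the identical calculation and concludes the human is worth
# 3.025 * P (P = alien neuron count). Each "proves" the other is worth ~3x
# more, which is exactly the two-envelope symmetry.
```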
Clearly, this reasoning is wrong. The cases of the alien and the human are entirely symmetric: both should realise this, rate each other equally, and just save whoever’s closer.
If your reasoning gives the wrong answer when you scale it up to aliens, it’s probably also giving the wrong answer for chickens and elephants.
If we make reasoning about chickens that is correct, it should also be able to scale up to aliens without causing problems. If your framework doesn’t work for aliens, that’s an indication that something is wrong with it.
Chickens don’t hold a human-favouring position because they are not hedonic utilitarians, and aren’t intelligent enough to grasp the concept. But your framework explicitly does not weight the worth of beings by their intelligence, only their capacity to feel pain.
I think it’s simply wrong to switch in the case of the human vs alien tradeoff, because of the inherent symmetry of the situation. And if it’s wrong in that case, what is it about the elephant case that has changed?
So in the two-elephants problem, by pinning to humans, are you affirming that switching from 1 human EV to 1 elephant EV, when you are unsure about the HEV-to-EEV conversion, actually is the correct thing to do?
Like, option 1 is 0.25 HEV better than option 2, but option 2 is 0.25 EEV better than option 1, but you should pick option 1?
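To put toy numbers on that (these are purely illustrative figures of my own, not anyone’s actual moral-weight estimates): suppose the only uncertainty is whether one elephant is worth half a human or two humans, with equal probability.

```python
# Hypothetical 50/50 uncertainty over the conversion ratio r, where r is the
# worth of one elephant measured in HEV (human-equivalent value).
ratios = [0.5, 2.0]

# Value of saving 1 elephant, in HEV:
elephant_in_hev = sum(0.5 * r for r in ratios)        # 0.5*0.5 + 0.5*2.0 = 1.25
# Value of saving 1 human, in EEV (elephant-equivalent value), using 1/r:
human_in_eev = sum(0.5 * (1.0 / r) for r in ratios)   # 0.5*2.0 + 0.5*0.5 = 1.25

print(elephant_in_hev, human_in_eev)  # 1.25 1.25

# Measured in HEV, saving the elephant looks 0.25 better; measured in EEV,
# saving the human looks 0.25 better. Whichever unit you pin to decides the
# answer -- that's the two-envelope structure of the problem.
```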
What if instead of an elephant, we were talking about a sentient alien? Wouldn’t they respond to this with an objection like “hey, why are you picking the HEV as the basis, you human-centric chauvinist?”
Maybe it’s worth pointing out that Bostrom, Sandberg, and Yudkowsky were all in the same extropian listserv together (the one from the infamous racist email), and have been collaborating with each other for decades. So maybe it’s not precisely a geographic distinction, but there is a very tiny cultural one.
A couple of astronauts hanging out in a dome on Mars is not the same thing as an interplanetary civilization. I expect Mars landings to follow the same trajectory as the Moon landings: put a few people there for the sake of showing off, then not bother about it for half a century, then half-assedly discuss putting people there long term, again for the sake of showing off.
I recommend the book A City on Mars for an explanation of the massive social and economic barriers to space colonisation.
I hope you don’t take this the wrong way, but this press release is badly written, and it will hurt your cause.
I know you say you’re talking about more than extinction risks, but when you put “The probability of AGI causing human extinction is greater than 99%” in bold and red highlight, that’s all anyone will see. And then they can go on to check what experts think, and notice that only a fringe minority, even among those concerned with AI risk, believes that figure.
By declaring your own opinion as the truth, over that of experts, you come off like an easily dismissible crank. One of the advantages of the climate protest movements is that they have a wealth of scientific work to point to for credibility. I’m glad you are pointing out current day harms later on in the article, but by then it’s too late and everyone will have written you off.
In general, there are too many exclamation points! It comes off as weird and off-putting! And RANDOMLY BREAKING INTO ALL CAPS makes you look like you’re arguing on an internet forum. There are also overly long paragraphs full of confusing phrases that a layperson won’t understand.
I suggest you find some people who have absolutely zero exposure to AI safety or EA at all, and run these and future documents by them for ideas on improvements.
Explaining the discrepancies in cost effectiveness ratings: A replication and breakdown of RP’s animal welfare cost effectiveness calculations
No worries, and I have finally managed to replicate Laura’s results and find the true source of disagreement. The key factor missing was the period of egg laying: I put in ~1 year for both caged and cage-free, as is assumed on the site that provided the hours-of-pain figures. This 1-year laying period assumption seems to match other sources. Whereas in the causal model, the caged laying period is given as 1.62 years, and the cage-free laying period is given as 1.19 years. The causal model appears to have tried to calculate this, but it makes more sense to me to use the figures from the site that measured the pain estimates: they made the measurements, they are unlikely to be 150% off, and we should be comparing like with like here.
When I took this into account, I was able to replicate Laura’s results, which I have summarised in this Google doc, which also contains my own estimate and another analysis for broilers, as well as the sources for all the figures.
My DALY weights were using geometric means (I wasn’t sure how to deal with the lognormals), but switching to regular averages like you suggest makes things match better.
Under Laura’s laying period, my final estimate is 3742 chicken-DALYs/thousand dollars, matching well with the causal model’s number of 3.5k (given I’m not using distributions). Discounting this by the 0.332 figure from the moral weights (this includes sentience estimates, right?) gives a final figure of 1242 DALYs per thousand dollars (or 1162 if we use the 3.5k figure directly).
Under my laying period figures, the final estimate is 6352 chicken-DALYs/thousand dollars, which discounted by the RP moral weights comes to 2108 DALYs/thousand dollars. A similar analysis for broilers gives 1500 chicken-DALYs per thousand dollars and 506 DALYs per thousand dollars.
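To make the discounting step easy to check, here’s the arithmetic for the hen figures above (the broiler figure follows the same pattern):

```python
# Apply RP's 0.332 moral-weight discount (which, as I understand it, folds in
# sentience estimates) to convert chicken-DALYs/$1000 into DALYs/$1000.
MORAL_WEIGHT_DISCOUNT = 0.332

estimates = {
    "my estimate, Laura's laying period": 3742,
    "causal model figure": 3500,
    "my estimate, my laying period": 6352,
}
for label, chicken_dalys in estimates.items():
    print(label, round(chicken_dalys * MORAL_WEIGHT_DISCOUNT))
# -> 1242, 1162, and ~2109 respectively (matching the figures above, up to rounding)
```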
The default values from the cross-cause website should match either Laura’s estimates or mine.
I agree that when it comes to decision making, Leif’s objection doesn’t work very well.
However, when it comes to communication, I think there is a point here (although I’m not sure it was the one Leif was making). If GiveWell communicates about the donation and how many lives you saved, and doesn’t mention the aid workers and mothers who put up nets, aren’t they selling those people short, and dismissing their importance?
In Parfit’s thought experiment, obviously you should go on the four-person mission and help save the hundred lives. But if you then went on a book tour touting what a hero you are for saving the hundred lives, without mentioning the other three people, you would be being a jerk.
I could imagine an aid worker in Uganda being kind of annoyed that they spent weeks working full time in sweltering heat handing out malaria nets for low pay, only to watch some tech guy in America take all the credit for the lifesaving work. It could hurt EA’s ability to connect with the third world.
Your initial post claimed that RP thought AW was 1000x more effective than GHD. I just thought I’d flag that in their subsequent analyses, they have reported much lower numbers. In this report (if I’m reading it right), they put a chicken campaign at ~1200 and AMF at ~20, a factor of 60, much lower than 1000x, disagreeing greatly with Vasco’s analysis which you linked. (All of these are using Saulius’s numbers and RP’s moral weights.)
If you go into their cross-cause calculator, the GiveWell bar is ~20 while the generic chicken campaign gives ~700 with default parameters, making AW only 35 times as effective as GHD.
I’ve been attempting to replicate the results in the comments of this post, and my number comes up higher, at ~2100 vs ~20, making AW 100 times as effective. Again, these are all using Saulius’s report and RP’s moral weights: if you disagree with those substantially, GHD might come out ahead.
Okay, I was looking at the field “DALYs per bird per year” in this report, which is 0.2, matching what I have replicated. The 0.23 figure is actually something else, which explains a lot of the confusion in this conversation. I’ll include my calculation at the end.
Before I continue, I want to thank you for being patient and working with me on this. I think people are making decisions based on these figures so it’s important to be able to replicate them.
This report states that Saulius’s numbers are being used:
I estimate the “chicken-DALYs” averted per $1000 spent on corporate campaigns, conditioned on hens being sentient. To do so, I use Šimčikas’ report, data from the Welfare Footprint Project on the duration of welfare harms in conventional and cage-free environments, and intensity weights by type of pain.
I think I’ve worked it out: if we take the 2.18 birds affected per year and multiply by the 15-year impact, we get 32.7 chicken-years affected/dollar, which is the same as the 54 chicken-years given by Saulius discounted by 40% (54*0.6 = 32.4). This is the number that goes into the 0.23 figure, and it does already take into account the 15 years of impact.
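In other words, the two routes to the same quantity are:

```python
# Two routes to "chicken-years affected per dollar" that should agree:
birds_per_dollar_per_year = 2.18   # birds affected per dollar per year (causal model)
years_of_impact = 15               # years the campaign's effect is assumed to last
route_a = birds_per_dollar_per_year * years_of_impact   # 32.7 chicken-years per dollar

saulius_chicken_years = 54         # Saulius's median chicken-years affected per dollar
discount = 0.6                     # the 40% discount noted above
route_b = saulius_chicken_years * discount              # 32.4 chicken-years per dollar

print(route_a, route_b)  # ~32.7 vs ~32.4 -- effectively the same number
```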
So I don’t get why there’s still a discrepancy: although we take different routes to get there, we have the same numbers and should be getting the same results.
My calculation, taken from here.
Laying time is 40 to 60 weeks, so we’ll assume it lasts exactly 1 year.
Disabling: 430 - 156 = 274 hours of disabling pain averted.
Hurtful: 4000 - 1741 = 2259 hours of hurtful pain averted.
Annoying: 6721 - 2076 = 4645 hours of annoying pain averted.
Total DALYs averted:
4.47 * 274 / (365*24) = 0.14 disabling DALYs averted
0.15 * 2259 / (365*24) = 0.0386 hurtful DALYs averted
0.015 * 4645 / (365*24) = 0.00795 annoying DALYs averted
Total is about 0.19 DALYs averted per hen per year.
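Or, as a short script (using the intensity weights from the calculation above: 4.47, 0.15, and 0.015 DALY-equivalents per year of disabling, hurtful, and annoying pain, respectively):

```python
HOURS_PER_YEAR = 365 * 24  # 8760

# Hours of each pain type averted per hen per laying year (caged minus cage-free),
# paired with the intensity weight (DALY-equivalents per year of that pain).
pains = {
    # name:      (hours averted, intensity weight)
    "disabling": (430 - 156,   4.47),
    "hurtful":   (4000 - 1741, 0.15),
    "annoying":  (6721 - 2076, 0.015),
}

total = 0.0
for name, (hours, weight) in pains.items():
    dalys = weight * hours / HOURS_PER_YEAR
    total += dalys
    print(f"{name}: {dalys:.4f} DALYs averted")

print(f"total: {total:.2f} DALYs averted per hen per year")  # ~0.19
```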
Laura’s numbers already take into account the number of chickens affected. The 0.23 figure is the total effect across all chickens covered, per dollar per year. To get the effect per $1000, we need to multiply by the number of years the effect will last and by 1000. Laura assumes a lognormal distribution for the length of the effect that averages to about 14 years. So roughly, 0.23 * 14 * 1000 = 3220 hen-DALYs per 1000 dollars.
I’m sorry, but this just isn’t true. You can look at the field for “annual CC DALYs per bird per year” here (with the 0.2 value); it does not include Saulius’s estimates. (I managed to replicate the value and checked it against the fields here; they match.)
Saulius’s estimates already factor in the 14 year effect of the intervention. You’ll note that the “chickens affected per dollar” is multiplied by the mean years of impact when giving out the “12 to 160” result.
Saulius is saying that each dollar affects 54 chicken-years of life, equivalent to moving 54 chickens from caged to cage-free environments for a year. The DALY conversion is saying that, in that year, each chicken will be 0.23 DALYs better off. So in total, 54*0.23 = 12.43 DALYs are averted per dollar, or 12,430 DALYs per thousand dollars, as I said in the last comment. However, I did notice in here that the result was deweighted by 20%-60% because they expected future campaigns to be less effective, which would bring it down to around 7458.
I didn’t factor in the moral conversions because those are separate fields on the site. If I use a P(sentience) of 0.8 and a moral weight of 0.44 as the site defaults to, the final DALYs per thousand should be 7458*0.8*0.44 = 2386 DALYs/thousand dollars, about three times more than the default value on the site.
I’m sympathetic, but to make the counterpoint: EA needs some way to protect against bullshit.
Scientists gatekeep publication behind peer review. Wikipedia requires that every claim be backed up with a source. Journalists employ fact checkers. None of these are in any way perfect (and are often deeply flawed), but the point is that theoretically, at least one qualified person with expertise in the subject has checked over what has been written for errors.
In contrast, how does EA ensure that the claims made here are actually accurate? Well, we first hope that people are honest and get everything right initially, but of course that can never guarantee anything. The main mechanism relied upon is that some random reader will bother to read an article closely enough to spot errors in it, and then write a criticism calling the error out in the comments, or write up their own post calling out said error. Of course this is sometimes acrimonious. But if we don’t put up the criticism, the BS claims will cement themselves, and start influencing real-world decisions that affect millions of dollars and people’s lives.
If we stop “sanctifying” criticism, then what exactly is stopping BS from taking over the entire movement (if it hasn’t already)? I’ve certainly seen genuinely good criticism dismissed as bad criticism because the original author misunderstood the critique, or differed in assumptions. If you’re going to rely on criticism as the imperfect hammer to root out bullshit nails, you kind of have to give it a special place.
I listed exactly two criticisms. One of them was proven correct; the other I believe to be correct as well, but I am waiting on a response.
I agree there is an asymmetry in the cost of waiting, but I think it runs the other way. If these errors are corrected a week from now, after the debate week has wrapped up, then everybody will have stopped paying attention to the debate, and it will become much harder to correct any BS arising from the faulty tool.
Do you truly not care that people are accidentally spreading misinformation here?
In the linked thread, the website owners have confirmed that there is indeed an error in the website. If you try to make calculations using the site as it currently stands, you will be off by a factor of a thousand. They have confirmed this and have stated that it will be fixed soon. When it is fixed, I will edit the shortform.
Would you prefer that for the next couple of days, during the heavily publicised AW vs GHD debate week, in which this tool has been cited multiple times, people continue to use it as is despite it being bugged and giving massively wrong results? Why are you not more concerned about flawed calculations being spread than about me pointing out that flawed calculations are being spread?
Thanks, I hope the typos will be fixed. I think I’ve almost worked through everything needed to replicate the results, but the default values still seem off.
If I take Saulius’s median result of 54 chicken-years of life affected per dollar, and then multiply by Laura’s conversion number of 0.23 DALYs per chicken per year, I get a result of 12.4 DALYs averted per dollar. Converting to DALYs per thousand dollars, this gives 12,420.
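Spelled out, the arithmetic I’m doing is just:

```python
saulius_chicken_years_per_dollar = 54   # Saulius's median: chicken-years affected per $
dalys_per_chicken_year = 0.23           # Laura's conversion: DALYs averted per chicken per year

dalys_per_dollar = saulius_chicken_years_per_dollar * dalys_per_chicken_year
print(dalys_per_dollar)          # ~12.4 DALYs averted per dollar
print(dalys_per_dollar * 1000)   # ~12,420 DALYs averted per $1000
```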
This is outside the 90% confidence interval for the defaults given on the site, which gives it as “between 160 and 3.6K suffering-years per dollar”. If I convert this to the default constant value, it gives the suggested value of 1,900, which is roughly ten times lower than the value I get by taking Saulius’s median and Laura’s conversion factor.
If I put the 12,420 number into the field, the site gives 4,630 DALYs per thousand dollars, about 10 times higher than originally stated in the post, which seems more in line with other RP claims (after all, right now the chicken campaign is presented as only 10 times more cost effective, whereas others are claiming it’s 1000x more effective using RP numbers).
This parameter is set to a normal distribution (which, unfortunately, you can’t control), and the normal distribution doesn’t change much when you lower the lower bound. A normal distribution between 0.002 and 0.87 is about the same as a normal distribution between 0 and 0.87. (Incidentally, if the distribution were a lognormal distribution with the same range, then the average result would fall halfway between the bounds in terms of orders of magnitude. This would mean cutting the lower bound would have a significant effect. However, the effect would actually raise the effectiveness estimate, because it would raise the uncertainty about the precise order of magnitude. The increase of scale outside the 90% confidence range represented by the distribution would more than make up for the lowering of the median.)
The upper end of the scale is already at “a chicken’s suffering is worth 87% of a human’s”. I’m assuming that very few people are claiming that a chicken’s suffering is worth more than a human’s. So wouldn’t the lognormal distribution be skewed to account for this, meaning that the switch would substantially change the results?
Can I ask how you arrived at the “millionths” number?
If we’re listing factors in EA leading to mental health problems, I feel it’s worth pointing out that a portion of EA thinks there’s a high chance of an imminent AI apocalypse that will kill everybody.
I myself don’t believe this at all, but for the people who do believe it, there’s no way it doesn’t affect their mental health.