Same content on bluesky, for those avoiding twitter now: https://bsky.app/profile/wdmacaskill.bsky.social/post/3lcdlb4lbdk2m
tobycrisford
I think this is a fascinating area, and the problems you've highlighted seem like important problems. I find it hard to believe it's a cause area EAs should focus on though.
As you explain, the clearest threat is the impact on cryptography, but it doesn't seem likely to me that that problem is neglected. There are huge incentives for governments and companies to solve that problem, and I think they are probably already doing lots of work on it..?
A question jumped out at me when reading these results. I should caveat this by emphasizing that I am very much not an expert in this kind of evaluation and this question may be naive.
Is there any seasonal effect on mortality in Malawi? If so, is it OK for the pre-intervention period to be 12 months while the post-intervention period is 18 months?
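To make the worry concrete, here is a minimal sketch using entirely made-up numbers (not data from the study). It assumes monthly mortality follows a fixed seasonal pattern with no intervention effect at all, so the only thing that differs between the two periods is their length. `SEASONAL_RATES` and `mean_rate` are hypothetical names, and whether this matters for the actual evaluation depends on how seasonality was handled in the analysis, which I don't know.

```python
# Hypothetical monthly mortality rates (deaths per 1,000) for one calendar year.
# Invented purely for illustration: a "high" season in the first six months
# and a "low" season in the last six.
SEASONAL_RATES = [12.0] * 6 + [8.0] * 6


def mean_rate(start_month: int, n_months: int) -> float:
    """Average monthly rate over a window, assuming the same seasonal
    pattern repeats every year and nothing else changes."""
    months = range(start_month, start_month + n_months)
    return sum(SEASONAL_RATES[m % 12] for m in months) / n_months


pre = mean_rate(start_month=0, n_months=12)    # one full seasonal cycle
post = mean_rate(start_month=12, n_months=18)  # one and a half cycles

print(f"pre-period mean:  {pre:.2f}")
print(f"post-period mean: {post:.2f}")
```

In this toy setup the two averages differ even though nothing changed between the periods, because the 18-month window counts one season twice. Starting the post-period in the low season instead (for example `mean_rate(18, 18)`) pushes the difference in the other direction.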
This is very helpful, thanks!
If you're correct in the linked analysis, this sounds like a really important limitation in ACE's methodology, and I'm very glad you've shared this!
In case anyone else has the same confusion as me when reading your summary: I think there is nothing wrong with calculating a charity's cost-effectiveness by taking the weighted sum of the cost-effectiveness of all of their interventions (weighted by share of total funding that intervention receives). This should mathematically be the same as (Total Impact / Total Cost), and so should indeed go up if their spending on a particular intervention goes down (while achieving the same impact).
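The identity above can be checked with a small sketch. The numbers and the function names `weighted_sum_ce` and `overall_ce` are my own illustrative assumptions, not anything taken from ACE's model. It only shows that a funding-share-weighted sum of per-intervention cost-effectiveness equals total impact over total cost, and that this quantity rises when one intervention gets cheaper for the same impact.

```python
def weighted_sum_ce(interventions):
    """Funding-share-weighted sum of per-intervention cost-effectiveness.

    `interventions` is a list of (impact, cost) pairs with non-zero costs.
    """
    total_cost = sum(cost for _, cost in interventions)
    return sum((cost / total_cost) * (impact / cost) for impact, cost in interventions)


def overall_ce(interventions):
    """Total impact divided by total cost."""
    total_impact = sum(impact for impact, _ in interventions)
    total_cost = sum(cost for _, cost in interventions)
    return total_impact / total_cost


# Hypothetical charity with two interventions, given as (impact, cost).
before = [(100.0, 50.0), (30.0, 50.0)]
# Same impact from the first intervention, at half the cost.
after = [(100.0, 25.0), (30.0, 50.0)]

for label, data in [("before", before), ("after", after)]:
    print(label, weighted_sum_ce(data), overall_ce(data))
```

The two functions agree in both scenarios because each term (cost share) × (impact / cost) simplifies to impact / total cost.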
The (claimed) cause of the problem is just that ACE's cost-effectiveness estimate does not go up by anywhere near as much as it should when the cost of an intervention is reduced, leading the cost-effectiveness of the charity as a whole to actually change in the wrong direction when doing the above weighted sum!
If this is true it sounds pretty bad. Would be interested to read a response from them.
Of course, the other thing that could be going on here is that average cost-effectiveness is not the same as cost-effectiveness on the margin, which is presumably what ACE should care about. Though I don't see why an intervention representing a smaller share of a charity's expenditure should automatically mean that this is not where extra dollars would be allocated. The two things seem independent to me.
I would be very interested to read a summary of what Tyler Cowen means by all this!
I know it was left as an exercise for the reader, but if someone wants to do the work for me it would be appreciated :)
Makes sense, thank you for the reply and clarification!
This is a fascinating summary!
I have a bit of a nitpicky question on the use of the phrase "confidence intervals" throughout the report. Are these really supposed to be interpreted as confidence intervals? Rather than the Bayesian alternative, "credible intervals"..?
My understanding was that the phrase "confidence interval" has a very particular and subtle definition, coming from frequentist statistics:
80% Confidence Interval: For any possible value of the unknown parameter, there is an 80% chance that your data-collection and estimation process would produce an interval which contained that value.
80% Credible interval: Given the data you actually have, there is an 80% chance that the unknown parameter is contained in the interval.
From my reading of the estimation procedure, it sounds a lot more like these CIs are supposed to be interpreted as the latter rather than the former? Or is that wrong?
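As a rough illustration of the frequentist definition only, here is a sketch that repeats a toy experiment many times and counts how often the resulting 80% interval contains the true value. The setup (normal data, known standard deviation, the names `interval_80` and `coverage`) is my own assumption and has nothing to do with the report's actual estimation procedure. It does not simulate the credible-interval statement, which conditions on the observed data and needs a prior.

```python
import random
from statistics import NormalDist, mean

# Two-sided 80% interval leaves 10% in each tail.
Z_80 = NormalDist().inv_cdf(0.90)


def interval_80(sample, sigma):
    """80% confidence interval for a normal mean with known sigma."""
    centre = mean(sample)
    half_width = Z_80 * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(true_mean, sigma=1.0, n=25, trials=5_000, seed=0):
    """Fraction of repeated experiments whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = interval_80(sample, sigma)
        hits += low <= true_mean <= high
    return hits / trials


# The coverage should be close to 0.8 whatever the true value is.
for mu in (-3.0, 0.0, 10.0):
    print(mu, coverage(mu))
```

The probability here is over repetitions of the data-collection process for a fixed parameter, which is the sense in which the first definition above uses "80% chance".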
Appreciate this is a bit of a pedantic question, that the same terms can have different definitions in different fields, and that discussions about the definitions of terms aren't the most interesting discussions to have anyway. But the term jumped out at me when reading and so thought I would ask the question!
This is a really interesting post, and I appreciate how clearly it is laid out. Thank you for sharing it! But I'm not sure I agree with it, particularly the way that everything is pinned to the imminent arrival of AGI.
Firstly, the two assumptions you spell out in your introduction, that AGI is likely only a few years away, and that it will most likely come from scaled up and refined versions of modern LLMs, are both much more controversial than you suggest (I think)! (Although I'm not confident they are false either.)
But even if we accept those assumptions, the third big assumption here is that we can alter a superintelligent AGI's values in a predictable and straightforward way by just adding in some synthetic training data which expresses the views we like, when building some of its component LLMs. This seems like a strange idea to me!
If we removed some concept from the training data completely, or introduced a new concept that had never appeared otherwise, then I can imagine that having some impact on the AGI's behaviour. But if all kinds of content are included in significant quantities anyway, then I find it hard to get my head around the inclusion of additional carefully chosen synthetic data having this kind of effect. I guess it clashes with my understanding of what a superintelligent AGI means, to think that its behaviour could be altered with such simple manipulation.
I think an important aspect of this is that even if AGI does come from scaling up and refining LLMs, it is not going to just be an LLM in a straightforward definition of that term (i.e. something that communicates by generating each word with a single forward pass through a neural network). At the very least it must also have some sort of hidden internal monologue where it does chain of thought reasoning, and stores memories, etc.
But I don't know much about AI alignment, so would be very interested to read and understand more about the reasoning behind this third assumption.
All that said, even ignoring AGI, LLMs are likely going to be used more and more in people's everyday lives over the next few years, so training them to express kinder views towards animals seems like a potentially worthwhile goal anyway. I don't think AGI needs to come into it!
I agree that we can imagine a similar scenario where your identity is changed to a much lesser degree. But I'm still not convinced that we can straightforwardly apply the Platinum rule to such a scenario.
If your subjective wellbeing is increased after taking the pill, then one of the preferences that must be changed is your preference not to take the pill. This means that when we try to apply the Platinum rule: "treat others as they would have us treat them", we are naturally led to ask: "as they would have us treat them when?" If their preference to have taken the pill after taking it is stronger than their preference not to take the pill before taking it, the Platinum rule becomes less straightforward.
I can imagine two ways of clarifying the rule here, to explain why forcing someone to take the pill would be wrong, which you already allude to in your post:
We should treat others as they would have us treat them at the time we are making the decision. But this would imply that if someone's preferences are about to naturally, predictably, change for the rest of their life, then we should disregard that when trying to decide what is best for them, and only consider what they want right now. This seems much more controversial than the original statement of the rule.
We should treat others as they would have us treat them, considering the preferences they would have over their lifetime if we did not act. But this would imply that if someone was about to eat the pill by accident, thinking it was just a sweet, and we knew it was against their current wishes, then we should not try to stop them or warn them. This would create a very odd action/inaction distinction. Again, this seems much more controversial than the original statement of the rule.
In the post you say the Platinum rule might be the most important thing for a moral theory to get right, and I think I agree with you on this. It is something that seems so natural and obvious that I want to take it as a kind of axiom. But neither of these two extensions to it feels this obvious any more. They both seem very controversial.
I think the rule only properly makes sense when applied to a person-moment, rather than to a whole person throughout their life. If this is true, then I think my original objection still applies. We aren't dealing with a situation where we can apply the Platinum rule in isolation. Instead, we have just another utilitarian trade-off between the welfare of one (set of) person(-moments) and another.
This was a really thought-provoking read, thank you!
I think I agree with Richard Chappell's comment that: "the more you manipulate my values, the less the future person is me".
In this particular case, if I take the pill, my preferences, dispositions, and attitudes are being completely transformed in an instant. These are a huge part of what makes me who I am, so I think that after taking this pill I would become a completely different person, in a very literal sense. It would be a new person who had access to all of my memories, but it would not be me.
From this point of view, there is no essential difference between this thought experiment, and the common objection to total utilitarianism where you consider killing one person and replacing them with someone new, so that total well-being is increased.
This is still a troubling thought experiment of course, but I think it does weaken the strength of your appeal to the Platinum rule? We are no longer talking about treating a person differently to how they would want to be treated, in isolation. We just have another utilitarian thought experiment where we are considering harming person X in order to benefit a different person Y.
And I think my response to both thought experiments is the same. Killing a person who does not want to be killed, or changing the preferences of someone who does not want them changed, does a huge amount of harm (at least on a preference-satisfaction version of utilitarianism), so the assumption in these thought experiments that overall preference satisfaction is nevertheless increased is doing a lot of work, more work than it might appear at first.
I really like this thought experiment, thank you for sharing!
Personally, I agree with you, and I think the answer to your headline question is: yes! Your reasoning makes sense to me anyway. (At least if we don't combine the Self-Sampling Assumption with another assumption like the Self-Indication Assumption as well.)
I think that your example is essentially equivalent to the Doomsday argument, or the Adam+Eve paradox, see here: https://anthropic-principle.com/preprints/cau/paradoxes But I like that your thought experiment really isolates the key problem and puts precise numbers on it!
I haven't digested the full paper yet, but based on the summary pasted below, this is precisely the claim I was trying to argue for in the "Against Anthropic Shadow" post of mine that you have linked.
It looks like this claim has been fleshed out in a lot more detail here though, and I'm looking forward to reading it properly!
In the post you linked I also went on quite a long digression trying to figure out if it was possible to rescue Anthropic Shadow by appealing to the fact that there might be large numbers of other worlds containing life (this plausibly weakens the strength of evidence provided by A, which may then stop the cancellation in C). I decided it technically was possible, but only if you take a strange approach to anthropic reasoning, with a strange and difficult-to-define observer reference class.
Possibly focusing so much on this digression was a mistake though, since the summary above is really pointing to the important flaw in the original argument!
This is a fantastic answer, thank you!
I think (2) is the relevant one here. Maybe in the not too distant future there will be a massive shift in global public opinion, and the farming of animals (at least at industrial scale) will become a thing of the past. If you think most farmed animals lead lives so bad that they would be better off not being born, then the impact of this change would be huge. (And if you're a non-consequentialist vegan who doesn't like to view the issue in these terms, then it's harder to quantify the impact, but you probably care even more about doing everything possible to make this scenario happen.)
I think this is what is hoped for by the vegans who prioritise outreach. The idea would be that outreach either increases the probability of this scenario becoming reality, or it means that this scenario happens sooner than it otherwise would. I think this is a conceivable way that vegan outreach could have the kind of huge, hard to measure, benefit you're talking about.
Of course there's a whole argument to be had here. I'm sure lots of people would find this scenario so implausible as to not be worth considering (or they would think it will only happen if and when we get good cheap lab grown meat, or that we can't do anything to influence if and when it happens... etc).
I wasn't really trying to start that argument with this question, but just asking what someone who wants to give some weight to this argument in their donations should do.
Sure, but once you've assumed that already, you don't need to rely any more on an argument about shifts to P(X_1 > x) being cancelled out by shifts to P(X_n > x) for larger n (which if I understand correctly is the argument you're making about existential risk).
If P(X_N > x) is very small to begin with for some large N, then it will stay small, even if we adjust P(X_1 > x) by a lot (we can't make it bigger than 1!). So we can safely say under your assumption that adjusting the P(X_1 > x) factor by a large amount does influence P(X_N > x) as well, it's just that it can't make it not small.
The existential risk set-up is fundamentally different. We are assuming the future has astronomical value to begin with, before we intervene. That now means non-tiny changes to P(Making it through the next year) must have astronomical value too (unless there is some weird conspiracy among the probability of making it through later years which precisely cancels this out, but that seems very weird, and not something you can justify by pointing to global health as an analogy).
I don't see why the same argument holds for global health interventions....?
Why should X_N > x require X_1 > x....?
Thanks a lot for this answer! That sounds very plausible.
I think a lot depends here on whether:
i) We think there may well be a meaningful effect for vegan education initiatives but we can't measure it in a controlled experiment, or
ii) We think there is no meaningful effect for currently popular vegan education initiatives.
(By "meaningful", I basically mean an effect big enough that I might consider donating, which is admittedly a bit vague)
I think CC makes a good point. Whichever of these possibilities is true, it feels like there is still scope for someone interested in vegan outreach to do something useful with their donations. If (i), then we could fund research into alternative non-experimental ways of comparing existing vegan outreach interventions (EAs are often happy funding things on the basis of weaker evidence than RCTs). If (ii), then we could fund research to investigate alternative kinds of interventions that haven't been considered yet (or has everything been considered?) If unsure between (i) and (ii), we can do both!
Maybe there is already research on these questions that we could use as well. I've been doing some more digging and found this survey of vegans, linked to from Faunalytics: https://vomad.life/survey/#about-your-veganism This seems like a decent non-experimental way of finding out which factors might influence someone to go vegan.
On the basis of this survey, maybe some effective vegan outreach interventions would be:
Funding advertising campaigns for Veganuary
Funding the production and/or marketing of vegan documentaries
Funding the production and/or marketing of online videos with a vegan message
Correct me if I am wrong, but I think you are suggesting something like the following. If there is a 99 % chance we are in future 100 (U_100 = 10^100), and a 1 % (= 1 - 0.99) chance we are in future 0 (U_0 = 0), i.e. if it is very likely we are in an astronomically valuable world[1], we can astronomically increase the expected value of the future by decreasing the chance of future 0. I do not agree. Even if the chance of future 0 is decreased by 100 %, I would say all its probability mass (1 pp) would be moved to nearby worlds whose value is not astronomical. For example, the expected value of the future would only increase by 0.09 (= 0.01*9) if all the probability mass was moved to future 1 (U_1 = 9).
The claim you quoted here was a lot simpler than this.
I was just pointing out that if we take an action to increase near-term extinction risk to 100% (i.e. we deliberately go extinct), then we reduce the expected value of the future to zero. That's an undeniable way that a change to near-term extinction risk can have an astronomical effect on the expected value of the future, provided only that the future has astronomical expected value before we make the intervention.
It is not that I expect us to get worse at mitigation.
But this is more or less a consequence of your claims isn't it?
The cost of moving physical mass increases with distance, and I guess the cost of moving probability mass increases (maybe exponentially) with value-distance (difference between the value of the worlds).
I don't see any basis for this assumption. For example, it is contradicted by my example above, where we deliberately go extinct, and therefore move all of the probability weight from U_100 to U_0, despite their huge value difference.
Or I suppose maybe I do agree with your assumption (as I can't think of any counter-examples I would actually endorse in practice); I just disagree with how you're explaining its consequences. I would say it means the future does not have astronomical expected value, not that it does have astronomical value but that we can't influence it (since it seems clear we can if it does).
(If I remember our exchange on the Toby Ord post correctly, I think you made some claim along the lines of: there are no conceivable interventions which would allow us to increase extinction risk to ~100%. This seems like an unlikely claim to me, but it's also I think a different argument to the one you're making in this post anyway.)
Here's another way of explaining it. In this case the probability p_100 of U_100 is given by the huge product:
P(making it through next year) × P(making it through the year after, given we make it through year 1) × ... etc
Changing near-term extinction risk is influencing the first factor in this product, so it would be weird if it didn't change p_100 as well. The same logic doesn't apply to the global health interventions that you're citing as an analogy, and makes existential risk special. In fact I would say it is your claim (that the later factors get modified too in just such a special way as to cancel out the drop in the first factor) which involves near-term interventions having implausible effects on the future that we shouldn't a priori expect them to have.
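The product above can be sketched numerically. The yearly survival probabilities and the name `p_survive_all` are hypothetical, chosen only to illustrate the structure of the argument. The sketch assumes the later factors are left untouched by the intervention, which is exactly the point under dispute, so it shows what follows from that assumption rather than proving it.

```python
import math


def p_survive_all(yearly_survival):
    """P(making it through every year), as a product of conditional
    per-year survival probabilities."""
    return math.prod(yearly_survival)


# Hypothetical numbers: 1% extinction risk per year for 100 years.
baseline = [0.99] * 100
# Halve the risk in the first year only, leaving later factors untouched.
reduced_first_year = [0.995] + [0.99] * 99
# Deliberately going extinct in year 1.
extinct_now = [0.0] + [0.99] * 99

p_base = p_survive_all(baseline)
p_reduced = p_survive_all(reduced_first_year)

print(p_reduced / p_base)          # equals 0.995 / 0.99 up to rounding
print(p_survive_all(extinct_now))  # 0.0
```

Unless the later factors move to compensate, any proportional change to the first factor carries straight through to p_100, and setting the first factor to zero sets the whole product to zero.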
I hope you are right, but is there evidence that veganism is growing exponentially?