Making AI Welfare an EA priority requires justifications that have not been given

Author’s Note: Written in a slightly combative tone[1] as I have found the arguments for this week’s proposition insufficiently compelling given the debate statement at hand. I was also very rushed getting this out in time; with more time I would have focused more on the ideas and added more nuance and caveats. I apologise in advance for my shortcomings, and hope you can take the good parts and overlook the bad.

Parsing the Debate statement correctly means that supporting it entails supporting radical changes to EA

The statement for AI Welfare Debate Week (hereafter AWDW) is “AI welfare should be an EA priority”. However, expanding this with the clarifications provided by the Forum team leads to the expanded statement: “5%+ of unrestricted EA talent and funding should be focused on the potential well-being of future artificial intelligence systems”.

Furthermore, I’m interpreting this as a “right now course of action” claim and not an “in an ideal world wouldn’t it be nice if” claim. A second interpretation I had about AWDW was that posters were meant to argue directly for the proposition instead of providing information to help voters make up their minds. In either case, though especially the first, I think the case for the proposition has been severely under-argued.

To get even more concrete, I estimate the following:

  • As a rough estimate for the number of EAs, I take the number of GWWC Pledgers even if they’d consider themselves ‘EA-Adjacent’.[2] At my last check, the lifetime members page stated there were 8,983 members, so 5% of that would be ~449 EAs working specifically or primarily on the potential well-being of future artificial intelligence systems.

  • For funding, I indexed on Tyler Maule’s 2023 estimates of EA funding. That stood at $980.8M in estimated funding, so 5% of that would be ~$49.04M in yearly funding spent on AI Welfare.

  • This is obviously a quick-and-dirty method (a worked version of the numbers is sketched just after this list), but given the time constraints I hope it lands in the right order of magnitude for the claims we’re talking about.

  • Furthermore, I think the amount of money and talent currently spent on AI Welfare within EA is quite low, so unless one expects an influx of new talent and donors to EA specifically to work on AI Welfare, this re-prioritisation must necessarily come at the cost of other causes that EA cares about.[3]
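For concreteness, here is a minimal back-of-the-envelope sketch of the arithmetic behind the bullets above (and footnote 4). The inputs are just the figures already cited, the GWWC pledger count and Tyler Maule’s 2023 funding estimate, so treat it as a rough illustration rather than a careful model:

```python
# Back-of-the-envelope sketch of what "5%+ of unrestricted EA talent and funding"
# implies, using only the figures cited above (GWWC lifetime pledgers and
# Tyler Maule's 2023 funding estimate). These are rough inputs, not precise totals.

gwwc_pledgers = 8_983         # GWWC lifetime members at my last check
ea_funding_2023 = 980.8e6     # Tyler Maule's 2023 EA funding estimate (USD)
priority_share = 0.05         # "5%+ of unrestricted EA talent and funding"

people_on_ai_welfare = priority_share * gwwc_pledgers     # ~449 people
funding_on_ai_welfare = priority_share * ea_funding_2023  # ~$49.04M per year

# Footnote 4's conversion, using the old "$5,000 to save a life" bednet figure
cost_per_life_saved = 5_000
lives_per_year = funding_on_ai_welfare / cost_per_life_saved  # ~9,800

print(f"~{people_on_ai_welfare:.0f} people, ~${funding_on_ai_welfare / 1e6:.2f}M/year, "
      f"~{lives_per_year:,.0f} counterfactual lives per year at $5,000 each")
```

None of these inputs is precise, but the basic conclusion, that the proposition implies hundreds of people and tens of millions of dollars per year, is not sensitive to the exact figures.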

These changes can only be justified if the case for them is strong

This counterfactual impact on other EA causes cannot, therefore, be separated from arguments for AI Welfare. In my opinion, one of the Forum’s best ever posts is Holly Elmore’s We are in triage every second of every day. Engaging with Effective Altruism should make us all realise more deeply that the counterfactual costs of our actions can be large. To me, making such a dramatic and sudden shift to EA priorities would require strong justification, especially given the likely high counterfactual costs of the change.[4]

As an example, Tyler estimated that 2023 EA funding for Animal Welfare was ~$54M. In a world where AI Welfare were made a priority as per the statement’s definition, it would likely gain some resources at the expense of Animal Welfare, and plausibly become a higher EA priority by money and talent. This is a result that, prima facie, I think many or most EAs would not support, and so I wonder whether all of those who voted strongly or somewhat in favour of AWDW’s proposition fully grasped the practical implications of their view.

Most posts on AI Welfare Debate Week have failed to make this case

Meeting the burden of proof for prioritising AI Welfare requires stronger arguments

The arguments made this week, however, do not seem to have made that positive case at all. In fact, while reading (and skim-reading) the other posts, I didn’t find any specific reference to AWDW’s proposition. If that is right, the baseline conclusion should be that a good argument for it has not been made.

In some cases this is likely because the posts were written beforehand (or were pre-briefed to be about the topic of AI Welfare more generally). Perhaps, as I mentioned above, arguing for the proposition was not what the posts were meant to be doing, in which case the misinterpretation is on my end. But even then, I would still implore those reading these posts and voting on AWDW to recognise that the proposition they are voting on is a specific one, and to vote accordingly.

I was originally planning to write a post for AWDW arguing that LLM-based AI systems do not deserve moral consideration on the grounds that they are ‘counterfeit people’, as the late Daniel Dennett argued. This was dropped due to time constraints, but also because I believe the onus is on the proponents of AI Welfare to make a much more compelling case for enacting such a significant shift in EA priorities.

The justifications that are provided seem rather weak

In the posts that I interpreted as arguing in favour of the AWDW proposition, I saw many framings arguing that AI Welfare could possibly be a highly important topic. For example, the following claims come directly from posts written for AWDW:

These are interesting and broad philosophical claims, but to be action-guiding they require justification and grounding,[5] and I expected the authors to provide robust evidence for them. For example, we may have decades or less, but we may not. AIs could soon have an outsized impact, but they might not. The future may consist mostly of digital minds, but it might not.

It is not good enough to simply say that an issue might have a large-scale impact and conclude that it should therefore be an EA priority. Nor is it good enough to simply defer to Carl Shulman’s views if you yourself can’t argue why you think it’s “pretty likely… that there will be vast numbers of AIs that are smarter than us”, and why those AIs deserve moral consideration.

Unfortunately, I could find very little in the posts actually arguing that these possibilities hold, and to me this is simply not enough grounding to take ~450 people and ~$49M per year away from causes that are not in the ‘may’ stage but are actually doing good on issues that affect moral patients right now, where we can be much more confident in the tractability of our interventions and can set up feedback loops to check whether what we are doing is actually working.

In two cases, Not understanding sentience is a significant x-risk and Conscious AI concerns all of us, the authors present a 2x2 box to frame the debate:[6]

As George Box said: “All models are wrong, but some are useful”. But clearly, not all models are useful: you still need to argue how and why a given model is useful and why it should guide decision-making. In neither of the two cases, from what I can tell, do the authors attempt to work out which of the boxes we are actually in; the latter instead makes an appeal to the precautionary principle.
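To make that concrete, here is a minimal sketch of why a 2x2 alone is not action-guiding. The axes, payoffs, and probabilities below are all placeholders of my own, not figures from the cited posts; the point is only that the recommendation flips depending on the probability you assign to the “AI is sentient” row, which is precisely the claim that needs arguing:

```python
# Hypothetical illustration only: a 2x2 ("is the AI sentient?" x "do we act as if
# it is?") becomes action-guiding only once probabilities and stakes are attached.
# All payoffs and probabilities below are placeholders, not claims from the posts.

def expected_value(p_sentient: float, act: bool) -> float:
    """Toy expected moral value of acting vs. not, given P(AI is sentient)."""
    payoff = {
        (True, True): 100,    # sentient and we protect it: large benefit
        (True, False): -100,  # sentient and we ignore it: large harm
        (False, True): -10,   # not sentient, but we spend resources anyway
        (False, False): 0,    # not sentient and we do nothing: neutral
    }
    return p_sentient * payoff[(True, act)] + (1 - p_sentient) * payoff[(False, act)]

for p in (0.001, 0.05, 0.5):
    act_better = expected_value(p, act=True) > expected_value(p, act=False)
    print(f"P(sentient) = {p}: act as if sentient? {act_better}")

# The recommendation flips as p changes, so the 2x2 by itself settles nothing:
# you still have to argue which box (or what probability of each box) we are in.
```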

But one could make precautionary appeals to almost any hypothetical proposition, and for AI Welfare to get the amount of funding and talent the proposition implies, stronger arguments are needed. Without the advocates for AI Welfare making a stronger case that current or near-future AI systems deserve enough moral consideration to be an EA priority, I will continue to find myself unmoved.

Working on AI Welfare can still be valuable and interesting

I’ve worded this essay pretty harshly, so I want to end with a tone of reconciliation and to clarify my point of view:

  • I do think that AI Welfare is an interesting topic to explore.

  • I think the surrounding research into questions of consciousness is exciting and fundamentally connected to many other important questions EA cares about.

  • If digital beings were to exist and be morally relevant, then these questions would need to be answered.[7]

  • Those who find these questions deeply interesting and motivating should still work on them.

  • It is possible that these questions deserve more funding and talent than they are currently getting both within and outside of EA.

Nevertheless, the fact that a topic is interesting and potentially valuable is not enough to make it an EA priority, especially in terms of the question that AWDW has placed before us.

Conclusion: A Win by Default?

The tl;dr summary of the post is this:

  • AWDW’s proposition to make AI welfare an EA priority means allocating 5% of EA talent and funding to this cause area.

  • This would require significant changes to EA funding allocation and project support, and would entail significant counterfactual costs and trade-offs.

  • To make this change, therefore, requires strong justification and confidence about the value[8] of AI Welfare work.

  • Such justification has not been provided during AWDW.

  • Therefore, Forum voters ought to vote toward the left end of the slider until such justifications are provided.

  • This does not mean such justifications do not exist or cannot be provided, only that they have not been provided on the Forum during AWDW.

  • While AI welfare should not be an EA priority at the moment, as AWDW defines that term, it could still be interesting and valuable work.

  1. ^

    Though I think I’ve edited out the worst parts sufficiently now.

  2. ^

    Self-identification doesn’t seem to track what we care about here.

  3. ^

    The zero-sum nature falls out of my highly practical interpretation of the AWDW question. Maybe you’re defining talent and funding more malleably, in which case I’d like to see that explicitly argued. If you didn’t read the clarifications given by the Forum team and are just voting based on ‘vibes’, then I think you failed to understand the assignment.

  4. ^

    Assuming the old ‘$5,000 to save a life through bednets’ figure is roughly accurate, this may be a recurring cost of up to ~9,800 children’s lives per year (≈$49.04M ÷ $5,000).

  5. ^

    By which I mean that it is not enough to argue for p→q, where q is some course of action you want taken; to get to q you need to argue both that p→q holds and that p is actually the case. The latter is what I think has been substantially missing from posts this week.

  6. ^

    Now, in the former article, the authors point out that this framing is isomorphic to Pascal’s Wager, which to me should be a sign that something has potentially gone wrong in the reasoning process.

  7. ^

    I’m most sceptical that we can do anything tractable about this at the moment, though.

  8. ^

    That is, in terms of scale, tractability, and neglectedness.