Derek Shiller

Karma: 2,390

Derek Shiller Nov 20, 2024, 6:33 PM
8 points
2 ∶ 0
in reply to: Mikolaj Kniejski’s comment on: LLMs are weirder than you think
I appreciate the pushback on these claims, but I want to flag that you seem to be reading too much into the post. The arguments that I provide aren’t intended to support the conclusion that we shouldn’t treat “I feel pain” as a genuine indicator or that there definitively aren’t coherent persons involved in chatbot text production. Rather, I think people tend to think of their interactions with chatbots in the way they interact with other people, and there are substantial differences that are worth pointing out. I point out four differences. These differences are relevant to assessing personhood, but I don’t claim any particular thing I say has any straightforward bearing on such assessments. Rather, I think it is important to be mindful of these differences when you evaluate LLMs for personhood and moral status. These considerations will affect how you should read different pieces of evidence. A good example of this is the discussion of the studies in the self-identification section. Should you take the trouble LLMs have with counting tokens as evidence that they can’t introspect? No, I don’t think it provides particularly good evidence, because it relies on the assumption that LLMs self-identify with the AI assistant in the dialogue and it is very hard to independently tell whether they do.
Firstly, this claim isn’t accurate. If you provide an LLM with the transcript of a conversation, it can often identify which parts are its responses and which parts are user inputs. This is an empirically testable claim. Moreover, statements about how LLMs process text don’t necessarily negate the possibility of them being coherent personas. For instance, it’s conceivable that an LLM could function exactly as described and still be a coherent persona.
I take it that you mean that LLMs can distinguish their text from others, presumably on the basis of statistical trends, so they can recognize text that reads like the text they would produce? This seems fully in line with what I say: what is important is that LLMs don’t make any internal computational distinction in processing text they are reading and text they are producing. The model functions as a mapping from inputs to outputs, and the mapping changes solely based on words and not their source. If you feed them text that is like the text they would produce, they can’t tell whether or not they produced it. This is very different from the experience of a human conversational partner, who can tell the difference between being spoken to and speaking and doesn’t need to rely on distinguishing whether words sound like something they might say. More importantly, they don’t know in the moment they are processing a given token whether they are in the middle of reading a block of user-supplied text or providing additional text through autoregressive text generation.
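To make the "mapping from inputs to outputs" point concrete, here is a toy sketch. It is not a real language model: `toy_next_token` and `strip_provenance` are hypothetical names, and the "most frequent token" rule is invented purely for illustration. The only thing it is meant to show is that a pure function of the token sequence gives the same output whether earlier tokens were read or generated.

```python
from collections import Counter
from typing import List, Tuple


def toy_next_token(tokens: List[str]) -> str:
    """Toy stand-in for a model: a pure function of the token sequence.

    Illustrative rule only: return the most frequent token seen so far.
    """
    return Counter(tokens).most_common(1)[0][0]


def strip_provenance(context: List[Tuple[str, str]]) -> List[str]:
    """Drop the (hypothetical) source label; only the tokens reach the model."""
    return [token for _source, token in context]


# Same tokens, different provenance: in one case everything was user-supplied,
# in the other the last two tokens were produced by the model itself.
all_read = [("user", "the"), ("user", "cat"), ("user", "sat")]
partly_generated = [("user", "the"), ("model", "cat"), ("model", "sat")]

out_read = toy_next_token(strip_provenance(all_read))
out_generated = toy_next_token(strip_provenance(partly_generated))
print(out_read == out_generated)
```

Because the source label never enters the function, nothing downstream can depend on it; any sensitivity to "who wrote this" would have to come from the tokens themselves.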

LLMs are weirder than you think

Derek Shiller Nov 20, 2024, 1:39 PM

61 points

3 comments, 22 min read, EA link

Resource Allocation: A Research Agenda

arvomm Nov 14, 2024, 3:34 PM

44 points

7 comments, 33 min read, EA link

The Welfare of Digital Minds: A Research Agenda

Derek Shiller Nov 11, 2024, 12:58 PM

53 points

1 comment, 31 min read, EA link

Valuing Impacts Across Species: A Research Agenda

Bob Fischer Nov 4, 2024, 11:55 AM

45 points

1 comment, 22 min read, EA link

Derek Shiller Oct 19, 2024, 8:51 PM
2 points
0 ∶ 0
in reply to: Linch’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project
If some theories see reasons where others do not, they will be given more weight in a maximize-expected-choiceworthiness framework. That seems right to me and not something to be embarrassed about. Insofar as you don’t want to accept the prioritization implications, I think the best way to avoid them is with an alternative approach to making decisions under normative uncertainty.

Bargaining among worldviews

Hayley Clatterbuck Oct 18, 2024, 6:32 PM

57 points

5 comments, 12 min read, EA link

Derek Shiller Oct 18, 2024, 12:52 PM
4 points
0 ∶ 0
in reply to: titotal’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project
See, the thing that’s confusing me here is that there are many solutions to the two envelope problem, but none of them say “switching actually is good”.
What I’ve been suggesting is that when looking inside the envelope, it might subsequently make sense to switch depending upon what you see: when assessing human/alien tradeoffs, it might make sense to prefer helping the aliens depending on what it is like to be human. (It follows that it could have turned out that it didn’t make sense to switch given certain human experiences—I take this to play out in the moral weights context with the assumption that given certain counterfactual qualities of human experience, we might have preferred different schemes relating the behavioral/neurological indicators to the levels of welfare.)
This is not at all a rare view among academic discussions, particularly given the assumption that your prior probabilities should not be equally distributed over an infinite number of possibilities about what each of your experiences will be like (which would be absurd in the human/alien case).

Derek Shiller Oct 18, 2024, 12:30 PM
8 points
4 ∶ 0
in reply to: Jeff Kaufman 🔸’s comment on: Explaining the discrepancies in cost effectiveness ratings: A replication and breakdown of RP’s animal welfare cost effectiveness calculations
I would be surprised if most people had stronger views about moral theories than about the upshots for human-animal tradeoffs. I don’t think that most people come to their views about tradeoffs because of what they value; rather, they come to their views about value because of their views about tradeoffs.

Derek Shiller Oct 17, 2024, 7:17 PM
6 points
0 ∶ 1
in reply to: titotal’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project

Clearly, this reasoning is wrong. The cases of the alien and human are entirely symmetric: both should realise this and rate each other equally, and just save whoevers closer.

I don’t think it is clearly wrong. You each have separate introspective evidence and you don’t know what the other’s evidence is, so I don’t think you should take each other as being in the same evidential position (I think this is the gist of Michael St. Jules’ comment). Perhaps you think that if they do have 10N neurons, then the depth and quality of their internal experiences, combined with whatever caused you to assign that possibility a 25% chance, should lead them to assign that hypothesis a higher probability. You need not think that they are responding correctly to their introspective evidence just because they came to a symmetric conclusion. Maybe the fact that they came to a symmetric conclusion is good evidence that you actually have the same neuron count.

Your proposal of treating them equally is also super weird. Suppose that I offer you a bet with a 25% chance of a payout of $0.1, a 50% chance of $1, and a 25% chance of $10. It costs $1. Do you accept? Now I say, I will make the payout (in dollars) dependent on whether humans or aliens have more neurons. Your credences haven’t changed. Do you change your mind about the attractiveness of this monetary bet? What if I raise the costs and payout to amounts of money on the scale of a human life? What if I make the payout be constituted by saving one random alien life and the cost be the amount of money equal to a human life? What if the costs and payouts are alien and human lives? If you want to say that you should think the human and alien life are equally valuable in expectation, despite the ground facts about probabilities of neuron counts and assumed valuation schema, you’re going to have to say something uncomfortable at some point about when your expected values come apart from probabilities of utilities.
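As a quick check on the monetary version of the bet, here is a minimal sketch of the expected-value arithmetic. The variable names are my own, and it assumes a plain risk-neutral expected-value calculation with the probabilities and payouts stated above.

```python
# (probability, payout in dollars) for the bet described above
outcomes = [(0.25, 0.10), (0.50, 1.00), (0.25, 10.00)]
cost = 1.00

# The probabilities should form a complete distribution.
assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9

expected_payout = sum(p * payout for p, payout in outcomes)
expected_net = expected_payout - cost

print(f"expected payout: ${expected_payout:.3f}")
print(f"expected net gain: ${expected_net:.3f}")
```

The expected payout is about $3.03 against a cost of $1, so a risk-neutral bettor accepts. The question in the text is at which step, as dollars are swapped for lives, one would stop applying the same arithmetic.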

Derek Shiller Oct 16, 2024, 4:33 PM
10 points
0 ∶ 0
in reply to: NickLaing’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project

NB: (side note, not biggerst deal) I would personally appreciate it if this kind of post could somehow be written in a way that was slightly easier to understand for those of us who non moral philosophers, using less Jargon and more straightforward sentences. Maybe this isn’t possible though and I appreciate it might not be worth the effort simplifying things for the plebs at times ;).

Noted, I will keep this in mind going forward.

Derek Shiller Oct 16, 2024, 4:33 PM
8 points
0 ∶ 0
in reply to: Lukas Finnveden’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project

The alien will use the same reasoning and conclude that humans are more valuable (in expectation) than aliens. That’s weird.

Granted, it is a bit weird.

At this point they have no evidence about what either human or alien experience is like, so they ought to be indifferent between switching or not. So they could be convinced to switch to benefitting humans for a penny. Then they will go have experiences, and regardless of what they experience, if they then choose to “pin” the EV-calculation to their own experience, the EV of switching to benefitting non-humans will be positive. So they’ll pay 2 pennies to switch back again. So they 100% predictably lost a penny. This is irrational.

I think it is helpful to work this argument out within a Bayesian framework. Doing so will require thinking in some ways that I’m not completely comfortable with (e.g. having a prior over how much pain hurts for humans), but I think formal regimentation reveals aspects of the situation that make the conclusion easier to swallow.

In order to represent yourself as learning how good human experiences are and incorporating that information into your evidence, you will need to assign priors that allow for each possible value human experiences might have. You will also need to have priors for each possible value alien experiences might have. To make your predictable loss argument go through, you will still need to treat alien experiences as either half as good or twice as good with equal probabilities no matter how good human experiences turn out to be. (Otherwise, your predictable loss argument needs to account for what the particular experience you feel tells you about the probabilities that the alien’s experiences are higher or lower. This can give you evidence that contradicts the assumption that the alien’s value is equally likely to be half or twice.) This isn’t straightforwardly easy. If you think that human experience might be either worth N or N/2 and you think alien experience might be either N/2 or N, then learning that human experience is N will tell you that the alien experience is worth N/2.
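The last example can be checked by brute-force enumeration. This is only a sketch under assumptions I am adding: N is set to 1 as an arbitrary unit, and the only joint states allowed are those where the alien value is exactly half or twice the human value. The names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = Fraction(1)  # arbitrary unit of value (assumption for illustration)

human_values = [N, N / 2]   # what human experience might be worth
alien_values = [N / 2, N]   # what alien experience might be worth

# Keep only joint states where the alien is half or twice as good as the human.
consistent = [
    (h, a)
    for h, a in product(human_values, alien_values)
    if a in (h / 2, 2 * h)
]


def alien_given_human(h):
    """Alien values still possible after learning the human value is h."""
    return sorted({a for hh, a in consistent if hh == h})


print(alien_given_human(N))      # only N/2 remains
print(alien_given_human(N / 2))  # only N remains
```

With these bounded supports, learning the human value settles the alien value entirely, so the "equally likely to be half or twice" assumption cannot hold for every value you might observe.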

There are a few ways to set up the priors to get the conclusion that you should favor the alien after learning how good human experience is (no matter how good that is). One way is to assume off the bat that aliens are likely to have a higher probability of higher experiential values. Suppose, to simplify things a bit, you thought that the highest value of experience a human could have is N. (More realistically, the values should trail off with ever lower probabilities, but the basic point I’m making would still go through: aliens’ possible experience values couldn’t decline at the same rate as humans’ without violating the equal probability constraint.) Then, to allow that you could still infer that alien experience is as likely to be twice as good as any value you could discover, the highest value an alien could have would have to be 2*N. It makes sense given these priors that you should give preference to the alien even before learning how good your experiences are: your priors are asymmetric and favor them.

Alternatively, we can make the logic work by assigning a 0 probability to every possible value of human experience (and a 0 to every possible value of alien experience.) This allows that you could discover that human experience had any level of value, and, conditional on however good that was, the alien was likely to have half or twice as good experiences. However, this prior means that in learning what human experience is like, you will learn something to which you previously assigned a probability of 0. Learning propositions to which you assigned a 0 is notoriously problematic and will lead to predictable losses if you try to maximize expected utility for reasons completely separate from the two envelopes problem.

Derek Shiller Oct 16, 2024, 2:49 PM
2 points
0 ∶ 1
in reply to: titotal’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project
I think you should make the conversion because you know what human experience is like. You don’t know what elephant or alien experience is like. Elephants or aliens may make different choices than you do, but they are responding to different evidence than you have, so that isn’t that weird.

Derek Shiller Oct 15, 2024, 9:45 PM
7 points
0 ∶ 0
in reply to: NickLaing’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project
there are different moral theories at play, it gets challenging. I agree with Tomasik that there may sometimes be no way to make a comparison or extract anything like an expected utility.
What matters, I think, in this case, is whether the units are fixed across scenarios. Suppose that we think one unit of value corresponds to a specific amount of human pain and that our non-hedonist theory cares about pain just as much as our hedonistic theory, but also cares about other things in addition. Suppose that it assigns value to personal flourishing, such that it sees 1000x value from personal flourishing as pain mitigation coming from the global health intervention and thinks non-human animals are completely incapable of flourishing. Then we might represent the possibilities as such:
                          Animal    Global Health
Hedonism                  500       1
Hedonism + Flourishing    500       1000
If we are 50/50, then we should slightly favor the global health intervention, given its expected value of 500.5. This presentation requires that the hedonism + flourishing view count suffering just as much as the hedonist view. So unlike in the quote, it doesn’t down weight the pain suffered by animals in the non-hedonist case. The units can be assumed to be held fixed across contexts.
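The expected values can be reproduced with a few lines. This is a sketch using the four values given above and an even split of credence between the two theories; the dictionary layout and names are mine.

```python
# Value of each intervention under each theory, in the shared fixed unit.
values = {
    "Hedonism": {"Animal": 500, "Global Health": 1},
    "Hedonism + Flourishing": {"Animal": 500, "Global Health": 1000},
}

# Even credence across the two theories.
credence = {"Hedonism": 0.5, "Hedonism + Flourishing": 0.5}

expected = {
    intervention: sum(credence[t] * values[t][intervention] for t in values)
    for intervention in ("Animal", "Global Health")
}

print(expected)
```

Summing across theories like this is only meaningful because one unit of value is assumed to pick out the same amount of human pain under both theories.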
If we didn’t want to make that assumption, we could try to find a third unit that was held fixed that we could use as a common currency. Maybe we could bring in other views to act as an intermediary. Absent such a common currency, I think extracting an expected value gets very difficult and I’m not sure what to say.
Requiring a fixed unit for comparisons isn’t so much of a drawback as it might seem. I think that most of the views people actually hold care about human suffering for approximately the same reasons, and that is enough license to treat it as having approximately the same value. To make the kind of case sketched above concrete, you’d have to come to grips with how much more valuable you think flourishing is than freedom from suffering. One of the assumptions that motivated the reductive presuppositions of the Moral Weight Project was that suffering is one of the principal components of value for most people, so that it is unlikely to be vastly outweighed by the other things people care about.

Derek Shiller Oct 15, 2024, 6:53 PM
7 points
1 ∶ 0
in reply to: Vasco Grilo🔸’s comment on: The Moral Two Envelopes Problem and the Moral Weights Project
It is an intriguing use of a geometric mean, but I don’t think it is right because I think there is no right way to do it given just the information you have specified. (The geometric mean may be better as a heuristic than the naive approach—I’d have to look at it in a range of cases—but I don’t think it is right.)
The section on Ratio Incorporation goes into more detail on this. The basic issue is that we could arrive at a given ratio either by raising or lowering the measure of each of the related quantities and the way you get to a given ratio matters for how it should be included in expected values. In order to know how to find the expected ratio, at least in the sense you want for consequentialist theorizing, you need to look at the details behind the ratios.

The Moral Two Envelopes Problem and the Moral Weights Project

Derek Shiller Oct 15, 2024, 6:12 PM

92 points

20 comments, 14 min read, EA link

Derek Shiller Oct 15, 2024, 5:04 PM
24 points
4 ∶ 0
on: Explaining the discrepancies in cost effectiveness ratings: A replication and breakdown of RP’s animal welfare cost effectiveness calculations
Thanks for this detailed presentation. I think it serves as a helpful, clear, and straightforward introduction to the models and uncovers aspects of the original model that might be unintuitive and open to question. I’ll note that the model was originally written by Laura Duffy and she has since left Rethink Priorities. I’ve reached out to her in case she wishes to jump in, but I’ll provide my own thoughts here.
1.) You note that we use different lifespan estimates for caged and cage-free hens from those of the Welfare Footprint Project. The reasons for this difference are explained here. However, you are right that though we attribute longer lives to caged hens – on the assumption that they are more often molted to extend productivity – we don’t adjust the hours-spent-suffering of caged hens, and that the diluted suffering of caged hens leads to a less effective verdict in the model.
I see three choices one could have made here: discard our lifespan assumptions, try to modify the Welfare Footprint hours-spent-suffering inputs, or keep the Welfare Footprint inputs paired with our longer lifespans. The final option is in some sense a more conservative choice and is the one we went with (but I can’t say whether it was an oversight or a deliberate choice).
Your alternative approach of using the Welfare Footprint numbers for both hours spent suffering and lifespan estimates seems sensible to me and would be less conservative.
2.) I believe some of the differences in your approach and ours may be explained by our desire to account for differences in productivity between hens in each environment. Our model includes estimates of eggs per chicken and assumes there need to be more cage-free hens to produce the same number of eggs. By lobbying for cage-free systems, you also increase the number of chickens confined in farms. This is accounted for in the variable Ratio CF/CC Hens, which we estimate to be 1.05. Including this further reduces the efficacy of cage-free campaigns because transitioning will increase the number of total hens.

Derek Shiller Oct 11, 2024, 5:59 PM
15 points
0 ∶ 0
in reply to: titotal’s comment on: What do RP’s tools tell us about giving $100m to AW or GHD?

Before I continue, I want to thank you for being patient and working with me on this. I think people are making decisions based on these figures so it’s important to be able to replicate them.

I appreciate that you’re taking a close look at this and not just taking our word for it. It isn’t inconceivable that we made an error somewhere in the model, and if no one pays close attention it would never get fixed. Nevertheless, it seems to me like we’re making progress toward getting the same results.

Total DALYs averted:

4.47*274/(365*24) = 0.14 disabling DALYS averted

0.15*2259/(365*24) = 0.0386 hurtful DALYS averted

0.015* 4645/(365*24) =0.00795 hurtful Dalys averted

Total is about 0.19 DALY’s averted per hen per year.

I take it that the leftmost numbers are the weights for the different pains? If so, the numbers are slightly different from the numbers in the model. I see an average weight of about 6 for disabling pain, 0.16 for hurtful pain, and 0.015 for annoying pain. This works out to ~0.23 in total. Where are your numbers coming from?
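To compare the two sets of weights side by side, here is a rough sketch. It reads the quoted figures as weight × hours / (365 × 24), takes the hours (274, 2259, 4645) from that reading, and assumes the third line refers to annoying pain. The model weights are the rounded averages mentioned above, so the second total is only approximate.

```python
HOURS_PER_YEAR = 365 * 24

# Hours per hen-year in each pain category, as read from the quoted calculation.
# Treating the third category as "annoying" is an assumption.
hours = {"disabling": 274, "hurtful": 2259, "annoying": 4645}

quoted_weights = {"disabling": 4.47, "hurtful": 0.15, "annoying": 0.015}
model_weights = {"disabling": 6.0, "hurtful": 0.16, "annoying": 0.015}  # rounded


def dalys_per_hen_year(weights):
    """Sum of weight * (fraction of the year spent in that pain category)."""
    return sum(weights[k] * hours[k] / HOURS_PER_YEAR for k in hours)


quoted_total = dalys_per_hen_year(quoted_weights)
model_total = dalys_per_hen_year(model_weights)

print(f"quoted weights: {quoted_total:.3f}")
print(f"model weights (rounded): {model_total:.3f}")
```

With the quoted weights the total comes out near 0.19; with the rounded model weights it lands in the low 0.2s, in the neighborhood of the ~0.23 figure. Almost all of the gap comes from the disabling-pain weight.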

Derek Shiller Oct 11, 2024, 1:55 PM
5 points
1 ∶ 0
in reply to: titotal’s comment on: What do RP’s tools tell us about giving $100m to AW or GHD?

Saulius is saying that each dollar affects 54 chicken years of life, equivalent to moving 54 chickens from caged to cage free environments for a year. The DALY conversion is saying that, in that year, each chicken will be 0.23 DALY’s better off. So in total, 54*0.23 = 12.43

I don’t believe Saulius’s numbers are directly used at any point in the model or intended to be used. The model replicates some of the work to get to those numbers. That said, I do think that you can use your approach to validate the model. I think the key discrepancy here is that the 0.23 DALY figure isn’t a figure per bird/year, but per year. The model also assumes that ~2.18 birds are affected per dollar. The parameter you would want to multiply by Saulius’s estimate is the difference between Annual CC Dalys/bird/year and Annual CF Dalys/bird/year, which is ~0.1. If you multiply that through, you get roughly 1000 DALYs/thousand dollars. This is still not exactly the number Laura arrives at via her Monte Carlo methods and not exactly the estimate in the CCM, but given the small differences in parameters, model structure, and computational approaches, this difference is in line with what I would expect.

Derek Shiller Oct 10, 2024, 4:49 PM
12 points
3 ∶ 0
in reply to: titotal’s comment on: What do RP’s tools tell us about giving $100m to AW or GHD?

If I take sallius’s median result of 54 chicken years life affected per dollar, and then multiply by Laura’s conversion number of 0.23 DALYs per $ per year, I get a result of 12.4 chicken years life affected per dollar. If I convert to DALY’s per thousand dollars, this would result in a number of 12,420.

Laura’s numbers already take into account the number of chickens affected. The 0.23 figure is a total effect to all chickens covered per dollar per year. To get the effect per $1000, we need to multiply by the number of years the effect will last and by 1000. Laura assumes a log normal distribution for the length of the effect that averages to about 14 years. So roughly, 0.23 * 14 * 1000 = 3220 hen DALYs per 1000 dollars.

Note: this is hen DALYs, not human DALYs. To convert to human DALYs we would need to adjust by the suffering capacity and sentience. In Laura’s model (we use slightly different values in the CCM), this would mean cutting the hen DALYs by about 70% and 10%, resulting in about 900 human-equivalent DALYs per 1000 dollars total over the lifespan of the effect. Laura was working in a Monte Carlo framework, whereas the 900 DALY number is derived just from multiplying means, so she arrived at a slightly different value in her report. The CCM also uses slightly different parameter settings for moral weights, but the result it produces still is in the same ballpark.
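The back-of-the-envelope chain above can be written out as follows. This is a sketch that multiplies means only, as the comment does, and it assumes that "cutting by about 70% and 10%" means multiplying by 0.3 and then by 0.9. The variable names are mine, not the model's.

```python
effect_per_dollar_per_year = 0.23   # hen DALYs per dollar per year
mean_years_of_effect = 14           # rough mean of the assumed log-normal
dollars = 1000

hen_dalys = effect_per_dollar_per_year * mean_years_of_effect * dollars

# Two successive discounts of about 70% and 10% (assumed to be multiplicative).
first_cut = 0.70
second_cut = 0.10
human_equivalent_dalys = hen_dalys * (1 - first_cut) * (1 - second_cut)

print(f"hen DALYs per $1000: {hen_dalys:.0f}")
print(f"human-equivalent DALYs per $1000: {human_equivalent_dalys:.0f}")
```

This gives 3220 hen DALYs and roughly 870 human-equivalent DALYs per $1000, which is the "about 900" figure. A Monte Carlo run over the full distributions would not generally match a product of means exactly.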
