Thanks!
Playing devil’s advocate:
Even if we grant that punishment is more effective than positive reward in shaping behavior, what about the consideration that, once the animal learns, it’ll avoid situations where it gets punished, but it actively seeks out (and gets better at) obtaining positive reward?
(I got this argument from Michael St Jules; see point 4 in the list in this comment.)
Edit: And as a possible counterpoint to the premise, I remember this review of a book on parenting and animal training, which says that training animals with a focus on positive reward (while also trying not to reward undesired behavior) works best. That’s a different context than evolution’s, though.
For what it’s worth, I agree with the sentence in your linked draft that “[...] not getting a reward may create frustration, which is nothing but another form of pain.”
But overall I’d be pretty hesitant to give much weight to theoretical arguments of this sort, especially since you can often think of counterconsiderations like the one above.
Fair point, though then:
Any being that is motivated by severe pain where yours is motivated by pleasure (or lighter pain like small frustration, indeed) should be selected for over yours.
Your animal will presumably still need reminders of what it feels like not to avoid these situations to actually be motivated to avoid them. (Unless the suffering it felt the last time was so traumatizing that it’ll never make the mistake again, but then this hardly goes against the suffering-prevalence thesis.)
We know (from empirical findings, this time) that many of those pain-inducing situations are common and hard to (systematically) avoid.
(EDIT to add:) Why would your learning animal need rewards if it can just not repeat past mistakes? Maybe learning abilities say more about how large the welfare range is than about pain vs. pleasure (see Schukraft et al. 2024, secs. 3.1.4 and 4.4.3).
In absolute terms, fair. I’m just skeptical that judgment calls on net welfare after empirically studying the lives of wild animals are any better. If there’s a logical or evolutionary reason to expect X, this seems like a stronger reason for X than “we’ve looked at what some wild animals commonly experience and we feel like what we see means X.”
Maybe stronger does not mean strong in absolute terms, though. But then, the conclusion would not be that we shouldn’t update much based on theoretical arguments of this sort, but that there is no evidence we can find (whether theoretical or empirical) on which we could base significant updates.
Interesting, I’ll look into this. Thanks!
“[...] not getting a reward may create frustration, which is nothing but another form of pain.” From my human experience, I can be living “net positive” while being extremely frustrated about something.
In general I think direct observation of individuals is a fantastic way forward. Maybe even the only way forward here. Theoretical arguments make so many assumptions that I feel like I could argue all sides here.
I’m amazed EAs haven’t funded some individual animal observation stuff. Put a small cam and a Fitbit on a deer or other prey animal and see what it gets up to? My guess is that its life would look more positive than we expect.