Notice that Shulman does not say anything about AI consciousness or sentience in making this case. Here and throughout the interview, Shulman de-emphasizes the question of whether AI systems are conscious, in favor of the question of whether they have desires, preferences, interests.
I’m a huge fan of Shulman in general, but on this point I find him quasi-religious. He once sincerely described hedonistic utilitarianism as ‘a doctrine of annihilation’ on the grounds (I assume) that it might advocate tiling the universe with hedonium—ignoring that preference-based theories of value either reach the same conclusions or have a psychopathic disregard for the conscious states sentient entities do have. I’ve written more about why here.
I have two views in the vicinity. First, there’s a general issue that human moral practice generally isn’t just axiology, but also includes a number of elements that are built around interacting with other people with different axiologies, e.g. different ideologies coexisting in a liberal society, different partially selfish people or family groups coexisting fairly while preferring different outcomes. Most flavors of utilitarianism ignore those elements, and ceteris paribus would, given untrammeled power, call for outcomes that would be ruinous for ~all currently existing beings, and in particular existing societies. That could be classical hedonistic utilitarianism diverting the means of subsistence from all living things as we know them to fuel more hedonium, negative-leaning views wanting to be rid of all living things with any prospects for having or causing pain or dissatisfaction, or playing double-or-nothing with the universe until it is destroyed with probability 1.
So most people have reason to oppose any form of utilitarianism getting absolute power (and many utilitarianisms would have reason to self-efface into something less scary, less dangerous, and less prone to using power in such ways, since that would have a better chance of realizing more of what they value while endangering other concerns less). I touch on this in an article with Elliott Thornley.
I have an additional objection to hedonic-only views in particular, in that they don’t even take as inputs many of people’s concerns, and so more easily wind up hostile to particular individuals supposedly for those individuals’ sake. E.g. I would prefer to retain my memories and personal identity, knowledge and autonomy, rather than be coerced into forced administration of pleasure drugs. I also would like to achieve various things in the world in reality, and would prefer that to an experience machine. A normative scheme that doesn’t even take those concerns as inputs is fairly definitely going to run roughshod over them, even if some theories that take them as inputs might do so too.
(You may be aware of these already, but I figured they were worth sharing if not, and for the benefit of other readers.)
Some “preference-affecting views” do much better on these counts and can still be interpreted as basically utilitarian (although perhaps not based on “axiology” per se, depending on how that’s characterized). In particular:
Object versions of preference views, as defended in Rabinowicz & Österberg, 1996 and van Weeldon, 2019. These views are concerned with achieving the objects of preferences/desires, essentially taking on everyone’s preferences/desires like moral views weighed against one another. They are not (necessarily) concerned with having satisfied preferences/desires per se, or just having more favourable attitudes (like hedonism and other experientialist views), or even objective/stance-independent measures of “value” across outcomes.[1]
The narrow and hard asymmetric view of Thomas, 2019 (for binary choices), applied to preferences/desires instead of whole persons or whole person welfare. In binary choices, if we add a group of preferences/desires and assume no other preference/desire is affected, this asymmetry is indifferent to the addition of the group if their expected total value (summing the value in favourable and disfavourable attitudes) is non-negative, but recommends against it if their expected total value is negative. It is also indifferent between adding one favourable attitude and another even more favourable attitude. Wide views, which treat contingent counterparts as if they’re necessary, lead to replacement. (The binary rule is sketched formally just after this list.)
Actualism, applied to preferences instead of whole persons or whole person welfare (Hare, 2007, Bykvist, 2007, St. Jules, 2019, Cohen, 2020, Spencer, 2021, for binary choices).

Dasgupta’s view, or other modifications of the above views in a similar direction, for more than two options to choose from, applied to preferences instead of whole persons or whole person welfare. This can avoid repugnance and replacement in three option cases, as discussed here. (I’m working on other extensions to choices between more than two options.)
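To make the binary version of the asymmetric rule above concrete, here is a minimal sketch in my own notation (not Thomas’s): let S be the group of added attitudes, assume no other attitude is affected, and let v(s) be the signed value of attitude s (positive for favourable attitudes, negative for disfavourable ones). Then:

$$\text{add } S \sim \text{do not add } S \quad \text{if } \sum_{s \in S} \mathbb{E}[v(s)] \ge 0, \qquad \text{do not add } S \succ \text{add } S \quad \text{if } \sum_{s \in S} \mathbb{E}[v(s)] < 0.$$

So a net-negative group of added attitudes counts against an option, while a net-positive group merely leaves us indifferent rather than counting in its favour; that is the asymmetry.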
I think the least alienating (and least paternalistic?) moral views are, perhaps by far, preference-affecting “consequentialist” views, without any baked-in deontological constraints/presumptions, although they can adopt some deontological presumptions from the actual preferences of people with deontological intuitions. For example, many people don’t care (much) more about being killed by another human than about dying by natural causes (all else equal), so it would be alienating to treat their murder as (much) worse, or as worth avoiding (much) more, than their death by natural causes on their behalf. But some people do care a lot about such differences, so we can be proportionately sensitive to those differences on their behalf, too. That being said, many preferences can’t be assigned weights or values on the same scale in a way that seems intuitively justified to me, essentially the same problem as intertheoretic comparisons across very different moral views.
I’m working on some pieces outlining and defending preference-affecting views in more detail.
Rabinowicz & Österberg, 1996:

To the satisfaction and the object interpretations of the preference-based conception of value correspond, we believe, two different ways of viewing utilitarianism: the spectator and the participant models. According to the former, the utilitarian attitude is embodied in an impartial benevolent spectator, who evaluates the situation objectively and from the ‘outside’. An ordinary person can approximate this attitude by detaching himself from his personal engagement in the situation. (Note, however, that, unlike the well-known meta-ethical ideal observer theory, the spectator model expounds a substantive axiological view rather than a theory about the meaning of value terms.) The participant model, on the other hand, puts forward as a utilitarian ideal an attitude of emotional participation in other people’s projects: the situation is to be viewed from ‘within’, not just from my own perspective, but also from the others’ points of view. The participant model assumes that, instead of distancing myself from my particular position in the world, I identify with other subjects: what it recommends is not a detached objectivity but a universalized subjectivity.
Object vs attitude vs satisfaction/combination versions of preference/desire views are also discussed in Bykvist, 2022 and Lin, 2022, and there’s some other related discussion by Rawls (1982, pdf, p.181) and Arneson (2006, pdf).
I liked your “Choose your (preference) utilitarianism carefully” series and think you should finish part 3 (unless I just couldn’t find it) and repost it on this forum.
Thanks! I wrote a first draft a few years ago, but I wanted an approach that leaned on intuition as little as possible if at all, and ended up thinking my original idea was untenable. I do have some plans on how to revisit it and would love to do so once I have the bandwidth.
ignoring that preference-based theories of value either reach the same conclusions or have a psychopathic disregard for the conscious states sentient entities do have
I think this is probably not true under some preference-affecting views (basically applying person-affecting views to preferences instead of whole persons) and a fairly wide concept of preference as basically an appearance of something mattering, being bad, good, better or worse (more on such appearances here). Such a wide concept of preference would include pleasure, unpleasantness, aversive desires, appetitive desires, moral intuitions, moral views, and goals.
Sorry, I should have said: either that, or they imply some at-least-equally-dramatic outcome (e.g. favouring immediate human extinction in the case of most person-affecting views). Though I also think there are convincing interpretations of such views in which they still favour some sort of shockwave, since they would seek to minimise future suffering throughout the universe, not just on this planet.
I’ll check this out if I ever get around to finishing my essay :) Off the cuff though, I remain immensely sceptical that one could usefully describe ‘preference as basically an appearance of something mattering, being bad, good, better or worse’ in such a way that such preferences could be
a. detachable from consciousness, and
b. unambiguous in principle, and
c. grounded in any principle that is universally motivating to sentient life (which I think is the big strength of valence-based theories)
Sorry, I should have said: either that, or they imply some at-least-equally-dramatic outcome (e.g. favouring immediate human extinction in the case of most person-affecting views).
I think this is probably not true, either, or at least not in a similarly objectionable way. There are person-affecting views that would not recommend killing everyone/human extinction for their own sake or to replace them with better-off individuals, when the original individuals have on average subjectively “good” lives (even if there are many bad lives among them). I think the narrow and hard asymmetric view by Thomas (2019) basically works in binary choices, although his extension to three or more options doesn’t work (I’m looking at other ways of extending it; I discuss various views and their responses to replacement cases here).
Off the cuff though, I remain immensely sceptical that one could usefully describe ‘preference as basically an appearance of something mattering, being bad, good, better or worse’ in such a way that such preferences could be
a. detachable from consciousness, and
b. unambiguous in principle, and
c. grounded in any principle that is universally motivating to sentient life (which I think is the big strength of valence-based theories)
a. I would probably say that preferences as appearances are at least minimal forms of consciousness, rather than detachable from it, under a gradualist view. (I also think hedonic states/valence should probably be understood in gradualist terms, too.)
b. I suspect preferences can be and tend to be more ambiguous than hedonic states/valence, but hedonic states/valence don’t capture all that motivates, so they miss important ways things can matter to us. I’m also not sure hedonic states/valence are always unambiguous. Ambiguity doesn’t bother me that much. I’d rather have ambiguity than discount whole (apparent) moral patients or whole ways things can (apparently) matter to us.
c. I think valence-based theories miss how some things can be motivating. I want to count all and only the things that are “motivating”, suitably defined. Roelofs (2022, ungated) explicitly counts all “motivating consciousness”:
The basic argument for Motivational Sentientism is that if a being has conscious states that motivate its actions, then it voluntarily acts for reasons provided by those states. This means it has reasons to act: subjective reasons, reasons as they appear from its perspective, and reasons which we as moral agents can take on vicariously as reasons for altruistic actions. Indeed, any being that is motivated to act could, it seems, sincerely appeal to us to help it: whatever it is motivated to do or bring about, it seems to make sense for us to empathise with that motivating conscious state, and for the being to ask us to do so if it understands this.
(...)
As I am using it, ‘motivation’ here does not mean anything merely causal or functional: motivation is a distinctively subjective, mental, process whereby some prospect seems ‘attractive’, some response ‘makes sense’, some action seems ‘called for’ from a subject’s perspective. The point is not whether a given sort of conscious state does or does not cause some bodily movement, but whether it presents the subject with a subjective reason for acting.
Or, if we did instead define these appearances functionally/causally in part by their (hypothetical) effects on behaviour (or cognitive control or attention specifically), as I’m inclined to, and define motivation functionally/causally in similar terms, then we could also get universal motivation, by definition. For example, something that appears “bad” would, by definition, tend to lead to its avoidance or prevention. This is all else equal and hypothetical, taking tradeoffs and constraints into account: e.g. if something seems bad to someone, they would avoid or prevent it if they could, but may not if they can’t, or if other appearances motivate them more.
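As a toy illustration of that dispositional reading (entirely my own sketch, with made-up names, not a model from any of the sources above): whether an agent actually avoids something that appears bad to them depends on whether avoidance is feasible and on whether stronger appearances compete, which is just the “all else equal” qualification made explicit.

```python
# Toy sketch of the dispositional/functional reading of "appears bad":
# the appearance is partly individuated by a tendency toward avoidance,
# which can be blocked (infeasible) or outweighed by competing appearances.
# My own illustration; the function and parameters are hypothetical.

def acts_to_avoid(bad_strength: float,
                  competing_strength: float,
                  avoidance_feasible: bool) -> bool:
    """Return True if the agent acts to avoid what appears bad to them."""
    if not avoidance_feasible:
        return False  # the disposition is there, but cannot issue in action
    # An outweighed disposition still motivates, but does not win out.
    return bad_strength > competing_strength

# Something seems bad, but a stronger appetitive appearance competes: no avoidance.
print(acts_to_avoid(bad_strength=1.0, competing_strength=3.0, avoidance_feasible=True))   # False
# The same bad appearance with nothing outweighing it: avoidance, by definition.
print(acts_to_avoid(bad_strength=1.0, competing_strength=0.0, avoidance_feasible=True))   # True
```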