Thank you very much for these kind words, Mr. JP Addison!
I would like to ask the EAs reading this comment for feedback on an idea I had regarding AI alignment that I’m really unsure about, if they’re willing. The idea goes like this:
1. We want to make AIs that have human values.
2. Human values ultimately come from our basic emotions and affects. These affects come from brain regions older than the neocortex (see Jaak Panksepp’s work and affective neuroscience for more on that).
3. So, if we want AIs with human values, we want AIs that at least have emotions and affects, or something resembling them.
4. We could, in principle, make such AIs by developing neural networks that work similarly to our brains, especially regarding those regions that are older than the neocortex.
If you think this idea is ridiculous and doesn’t make any sense, please say so, even in a short comment.
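To make point 4 a bit more concrete, here is a minimal toy sketch in Python (my own illustration, not something taken from Panksepp or from the alignment literature): a hard-wired valence function stands in for innate affect, and “values” are just learned estimates of the average felt valence of each action. All the names, outcomes, and numbers are made up.

```python
import random

# Hypothetical innate affect: assigns a valence (how good or bad something feels)
# to raw outcomes. In the analogy, this plays the role of the subcortical
# affective systems that are older than the neocortex.
def innate_affect(outcome: str) -> float:
    valence = {"pain": -1.0, "food": 0.8, "play": 0.5, "nothing": 0.0}
    return valence.get(outcome, 0.0)

# Hypothetical environment: each action can lead to a few possible outcomes.
OUTCOMES = {
    "approach": ["food", "pain", "nothing"],
    "play": ["play", "nothing"],
    "withdraw": ["nothing"],
}

def act_once(action: str) -> float:
    """Take an action, experience a random outcome, and return the felt valence."""
    outcome = random.choice(OUTCOMES[action])
    return innate_affect(outcome)

# "Values" here emerge as learned estimates of average felt valence per action,
# i.e. cognition layered on top of the innate affect signal.
def learn_values(episodes: int = 1000, lr: float = 0.05) -> dict:
    values = {action: 0.0 for action in OUTCOMES}
    for _ in range(episodes):
        action = random.choice(list(OUTCOMES))
        felt = act_once(action)
        values[action] += lr * (felt - values[action])
    return values

if __name__ == "__main__":
    print(learn_values())  # learned affect-based "values" for each action
```

Of course, this toy collapses everything that is philosophically interesting, but it points in the direction I mean: the affect signal comes first, and valuation is learned on top of it.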
Welcome to EA! I hope you will find it a welcoming and inspiring community.
I don’t think the idea is ridiculous at all! However, I am not certain 2 and 3 are true. It is unclear whether all our human values come from our basic emotions and affects (this would seem to exclude the possibility of value learning at the fundamental level; I take this to be still an open debate, and I know people doing research on this). It is also unclear whether the only way of guaranteeing human values in artificial agents is via emotions and affects or something resembling them, even if that may be one way to do so.
Oh my goodness, thanks for your comment!
Panksepp did talk about the importance of learning and cognition for human affects. For example, pure RAGE is a negative emotion that drives us to defend ourselves aggressively, but it isn’t directed at anyone in particular. Anger is learned RAGE: we are angry about something or at someone in particular. And then there are various resentments and hatreds that are more subdued and subtle and which we harbor with our thoughts. Something similar goes for the other six basic emotions.
Unfortunately, it seems like we don’t know that much about how affects work. If I understand you correctly, you said that some of our values have little to no connection to our basic affects (be they emotional or otherwise). I thought that all our values are affective, because values tell us what is good or bad and affects also tell us what is good or bad (i.e. both values and affects have valence), and because affects seem to “come” from older brain regions than the values we think and talk about. So I thought that we first have affects (e.g. pain is bad for me and for the people I care about) and then we think about those affects so much that we start to have values (e.g. suffering is bad for anyone who has it). But I could be wrong. Maybe affects and values aren’t always good or bad, and maybe their difference lies in more than how cognitively elaborated they are. I’d like to know more about what you meant by “value learning at the fundamental level”.
That is interesting. I am not very familiar with Panksepp’s work. That being said, I’d be surprised if his model (_these specific basic emotions_; these specific interactions of affect and emotion) were the only plausible option in current cogsci/psych/neuroscience.
Re “all values are affective”, I am not sure I understand you correctly. There is a sense in which we use value in ethics (e.g. not helping while people are starving far away goes against my values), and a sense in which we use it in psychology (e.g. in a reinforcement learning paradigm). The connection between value and affect may be clearer for the latter than for the former. As an illustration, I do get a ton of good feelings out of giving a homeless person some money, so I clearly value it. I get much less of a good feeling out of donating to AMF, so in that sense, I value it less. But in the ethical sense, I value it more—and this is why I give more money to AMF than to homeless people. You claim that all such ethical-sense values ultimately stem from affect, but I think that is implausible—look at e.g. Kantian ethics or virtue ethics, both of which take as their basis principles that are not rooted in affect.
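To illustrate that distinction with a toy example (all numbers made up), one can think of two separate rankings over the same actions: a psychological/RL-style “value” that tracks felt reward, and an ethical weighting that can come apart from it.

```python
# Made-up numbers for illustration only.
felt_reward = {"give_to_homeless_person": 0.9, "donate_to_AMF": 0.2}    # "warm glow" per dollar
ethical_value = {"give_to_homeless_person": 0.1, "donate_to_AMF": 0.9}  # estimated good done per dollar

# A purely affect-driven chooser ranks actions by felt reward...
affective_choice = max(felt_reward, key=felt_reward.get)
# ...while a reflectively endorsed choice ranks them by ethical value.
ethical_choice = max(ethical_value, key=ethical_value.get)

print(affective_choice, ethical_choice)  # give_to_homeless_person donate_to_AMF
```

The claim in the previous paragraph is essentially that these two tables need not agree, and that the second one is not fully derivable from the first.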
Re: value learning at the fundamental level, it strikes me as a non-obvious question whether we are “born” with all the basic valenced states, and everything else is just a learning history of how states of the world have affected those basic valenced states; or whether there are valenced states that only get unlocked/learned/experienced later. Having a child is sometimes used as an example—maybe that just taps into existing kinds of valenced states, but maybe all those hormones flooding your brain do actually change something in a way that could not have been experienced before.
Either way, I do think it may make sense to play around with the idea more!
Thanks for commenting!
In other words, there seem to be values that are more related to executive functions (e.g. self-control) than to affective states that feel good or bad? That seems like a plausible possibility.
There is a personality scale called the ANPS (Affective Neuroscience Personality Scales) whose subscales have been correlated with the Big Five personality traits. Conscientiousness wasn’t correlated with any of the six affects measured by the ANPS, while each of the other Big Five traits was correlated with at least one ANPS subscale. So conscientiousness seems related to what you talk about (values that don’t come from affects). But at the same time, there has been research on how prone conscientious people are to experiencing guilt, and it found that conscientiousness is positively correlated with guilt-proneness.
So, it seems that guilt is an experience of responsibility that differs in some way from the affective states that Panksepp talked about. And it’s related to conscientiousness, which could in turn be related to the ethical, philosophical values you talked about and to executive functions.
Hm, I wonder if AIs should have something akin to guilt. That may lead to AI sentience, or it may not.
Bibliography

Fayard, J. V., et al. (2012). Uncovering the affective core of conscientiousness: the role of self-conscious emotions. Journal of Personality. https://pubmed.ncbi.nlm.nih.gov/21241309/

Barrett, F. S., et al. (2013). A brief form of the Affective Neuroscience Personality Scales. Psychological Assessment. https://pubmed.ncbi.nlm.nih.gov/23647046/
Edit: I must say, I’m embarrassed by how much these comments of mine go by “This makes intuitive sense!” logic, instead of doing rigorous reviews of scientific studies. I’m embarrassed by how low the epistemic status of my comments is. But I’m glad that at least one EA found this idea interesting.
Re edit, you should definitely not feel embarrassed. A forum comment will often be a mix of a few sources and intuition rather than a rigorous review of all available studies. I don’t think this need imply low epistemic status, especially when the purpose is exploring an idea rather than, say, calling for funding (which would require a higher standard of evidence). Not all EA discussions are literature reviews; otherwise chatting would be so cumbersome!
I’d recommend using your studies to explore these and other ideas! Undergraduate studies are a wonderful time to soak up a ton of knowledge, and I look fondly upon mine—I hope you’ll have a similarly inspiring experience. Feel free to shoot me a pm if you ever want to discuss stuff.