PSIRPEHTA refers to the aggregate ordinary revealed preferences of individual actors, whom the AIs will be aligned to in order to make those humans richer, i.e. their preferences as revealed by their actions, such as what they spend their income on, NOT what they think is “morally correct”. For example, according to “human values” it might be wrong to eat meat, because maybe if humans reflected long enough they’d conclude that it’s wrong to hurt animals. But from the perspective of PSIRPEHTA, eating meat is generally acceptable, and empirically there’s little pressure for people to “reflect” on their values and change them.
EDIT: I guess I’d think of human values as what people would actually just sincerely and directly endorse without further influencing them first (although maybe just asking them makes them take a position if they didn’t have one before, e.g. if they’ve never thought much about the ethics of eating meat).
I think you’re overstating the differences between revealed and endorsed preferences, including moral/human values, here. Probably only a small share of the population thinks eating meat is wrong or bad, and most probably think it’s okay. Even if people generally would find it wrong or bad after reflecting long enough (I’m not sure they actually would), that doesn’t reflect their actual values now. Actual human values do not generally find eating meat wrong.
To be clear, you can still complain that humans’ actual/endorsed values are also far from ideal and maybe not worth aligning with, e.g. because people don’t care enough about nonhuman animals or helping others. Do people care more about animals and helping others than an unaligned AI would, in expectation, though? Honestly, I’m not entirely sure. Humans may care about animal welfare somewhat, but they also specifically want to exploit animals in large part because of their values, specifically food-related taste, culture, traditions and habit. Maybe people will also want to specifically exploit artificial moral patients for their own entertainment, curiosity or scientific research on them, not just because the artificial moral patients are generically useful, e.g. for acquiring resources and power and enacting preferences (which an unaligned AI could be prone to).
I illustrate some other examples here of the influence of human moral values on companies. These are all, of course, revealed preferences, but my point is that revealed preferences can importantly reflect endorsed moral values.
People influence companies in part on the basis of what they think is right, through demand, boycotts, law, regulation and other political pressure.
Companies, for the most part, can’t just go around directly murdering people (companies can still harm people, e.g. through misinformation on the health risks of their products, or because people don’t care enough about the harms). (Maybe this is largely for selfish reasons; people don’t want to be killed themselves, and there’s a slippery slope if you allow exceptions.)
GPT has content policies that reflect people’s political/moral views. Social media companies have use and content policies and have kicked off various users for harassment, racism, or other things that are politically unpopular, at least among a large share of users or advertisers (which also reflect consumers). This seems pretty standard.
Many companies have boycotted Russia since the invasion of Ukraine. Many companies have also committed to sourcing only cage-free eggs after corporate outreach and campaigns, despite cage-free egg consumption being low.
X (Twitter)’s policies on hate speech have changed under Musk, presumably primarily because of his views. That seems to have cost X users and advertisers, but X is still around and popular, so it also shows that some potentially important decisions about how a technology is used are largely in the hands of the company and its leadership, not just driven by profit.
I’d likewise guess it actually makes a difference that the biggest AI labs are (I would assume) led and staffed primarily by liberals. They can push their own views onto their AI even at the cost of some profit and market share. And some things may have minimal near-term consequences for demand or profit, but could be important for the far future. If a company decides to make its AI object more to various forms of mistreatment of animals or artificial consciousness, will this really cost it tons of profit and market share? And it could depend on the markets the AI is primarily used in, e.g. this would matter even less for an AI that brings in profit primarily through trading stocks.
It’s also often hard to say how much something affects a company’s profits.