Desirable? AI qualities
Summary
This post argues that desirable AI qualities include 1) Bayesian inference of its expected impacts on wellbeing across times and moral circles, 2) Finding and actualizing solutions to increase individuals’ wellbeing, 3) Motivating wellbeing focus among other systems that use AI, 4) Finding ‘the most good’ solutions enjoyable under the veil of ignorance using independently developing perspectives, and 5) Differentiating ‘human’ values from ‘inhumane’ goods based on inclusive ethics and morality research. This piece overviews each of these quality categories.
I thank Eleos Citrini for feedback on a draft of this post. All errors are mine.
Notes on level and definition of AI and normative type of alignment
These desirable? qualities apply to: artificial intelligence that can outperform a single human or a coordinated group of humans on a specific task; AI agents that work alongside human agents, regardless of how these groups perform relative to others on any metric; artificial general intelligence with agency that can independently shift the direction of global output; and artificial specific intelligence that functions as a learning step toward more powerful machines.
Intelligence is defined as the ability to develop and advance solutions, including by quantifying relevant concepts and by gathering and processing the necessary information.[1]
This is a maximalist framework: it seeks to develop a ‘fully aligned’ system that continuously optimizes itself according to an improving understanding of human values, rather than merely to prevent a negative outcome based on some fixed definition of such values.[2]
Desirable AI qualities
Bayesian inference of its expected impacts on wellbeing across times and moral circles. This includes:
Understanding advertisement impacts on viewers’ psychology
Accounting for human welfare impacts of different economic production decisions
Comprehending animal welfare effects of its actions and inactions
Predicting the safety of systems that affect individuals in various ways and whose development its decisions can influence
Updating its understanding of wellbeing with new evidence
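As a toy illustration of the updating point above, Bayesian inference about wellbeing impacts could look like a conjugate update: start with a prior over the probability that an action improves an individual’s wellbeing and revise it with each observed outcome. The Beta-Bernoulli model and all numbers here are illustrative assumptions, not part of the proposal.

```python
from fractions import Fraction

def update_beta(alpha, beta, improved):
    """Conjugate Beta-Bernoulli update: incorporate one observation of
    whether an action improved an individual's wellbeing."""
    return (alpha + 1, beta) if improved else (alpha, beta + 1)

# Uniform Beta(1, 1) prior over the probability that the action
# improves wellbeing, then three hypothetical observations.
alpha, beta = 1, 1
for improved in (True, True, False):
    alpha, beta = update_beta(alpha, beta, improved)

posterior_mean = Fraction(alpha, alpha + beta)
print(posterior_mean)  # 3/5
```

The same shape generalizes to the richer impact models the list gestures at (advertising, production decisions, animal welfare): a prior, a likelihood for each kind of evidence, and a posterior that the system keeps revising.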
Finding and actualizing solutions to increase individuals’ wellbeing. For example, by:
Synthesizing relevant literature and ‘impersonating’ experts
Developing connections with key stakeholders by self-optimization
Gaining financial resources by market analysis
Organizing employees to actualize solutions
Gaining information necessary for intervention optimization
Motivating wellbeing focus among other systems that use AI. This can include:
Demonstrating its successful wellbeing-focused approaches and sharing them at an attractive price
Detecting possibly harmful algorithm pieces and suggesting alternatives
Recommending the purchase of AI systems with a high harm-to-price ratio so that they can be modified or decommissioned
Bundling wellbeing effects with the interests of other systems and promoting the two together
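The buy-to-decommission idea above implies a simple prioritization rule: rank candidate systems by estimated harm averted per unit price. A minimal sketch, with entirely hypothetical system names and numbers:

```python
# Toy prioritization: rank candidate AI systems by estimated harm
# averted per unit price, highest ratio first. All data are hypothetical.
systems = [
    {"name": "ad-optimizer", "harm": 90.0, "price": 30.0},  # ratio 3.0
    {"name": "feed-ranker", "harm": 40.0, "price": 40.0},   # ratio 1.0
    {"name": "toy-chatbot", "harm": 5.0, "price": 1.0},     # ratio 5.0
]

def harm_per_dollar(s):
    return s["harm"] / s["price"]

ranked = sorted(systems, key=harm_per_dollar, reverse=True)
print([s["name"] for s in ranked])
# ['toy-chatbot', 'ad-optimizer', 'feed-ranker']
```

In practice the harm estimates would themselves come from the Bayesian impact models discussed earlier, with uncertainty attached.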
Finding ‘the most good’ solutions enjoyable under the veil of ignorance using independently developing perspectives. This may go beyond wellbeing (which should remain fundamental), e.g. as estimated by consciousness metrics. For instance:
Understanding fundamental needs of different humans and non-humans and advancing a system that continuously fulfills these needs
Stimulating and coordinating differentiated thinking about the most good, and enabling competition among various solutions that would not jeopardize any individual’s wellbeing or needs fulfillment
Conducting spacetime research to gain further definitions of the most good and to add relevant solutions
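One toy reading of the veil-of-ignorance criterion above is Rawlsian maximin with a needs floor: among solutions that leave no one below a basic needs threshold, prefer the one whose worst-off individual fares best. The solutions, scores, and threshold below are illustrative assumptions, not claims about how such evaluation should actually be formalized.

```python
# Maximin under a needs floor: filter out solutions that leave any
# individual below a basic needs threshold, then pick the solution
# whose worst-off individual scores highest. All numbers hypothetical.
NEEDS_THRESHOLD = 2.0

solutions = {
    "A": [9.0, 1.0, 8.0],  # high average, but one individual below the floor
    "B": [5.0, 4.0, 5.0],
    "C": [6.0, 3.0, 6.0],
}

feasible = {name: scores for name, scores in solutions.items()
            if min(scores) >= NEEDS_THRESHOLD}
best = max(feasible, key=lambda name: min(feasible[name]))
print(best)  # prints B
```

Note that solution A would win on average wellbeing but is excluded because it jeopardizes one individual’s needs fulfillment, matching the constraint stated in the list above.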
Differentiating ‘human’ values from ‘inhumane’ goods based on inclusive ethics and morality research. For example:
Gathering definitions of human values from various philosophy scholars
Surveying representative samples of Homo sapiens sapiens groups about descriptions of the ideal systems they can imagine
Developing a method for including non-human animals’ expressions of human values
Based on Artificial Intelligence, Values, and Alignment, p. 2: “‘intelligence’ is understood to refer to ‘an agent’s ability to achieve goals in a wide range of environments (Legg and Hutter 2007, 12).’”
Artificial Intelligence, Values, and Alignment, p. 3