[Comment crossposted from LessWrong]
In this post you avoided giving any concrete examples, but I wanted to brainstorm what some of the major decisions are.
The decision to run or deploy a particular model, on a particular day, in a particular way. (e.g. OpenAI released ChatGPT to the public on Nov 30, 2022.) This decision might be made by a single engineer, by a team of engineers, or by management.
The decision to pursue, or not pursue, a particular technology, idea or method. This is a decision made by engineers, researchers, management and grant-makers.
The decision to reveal, or not reveal, certain information to the public. The information could be source code, model weights, a whitepaper, or even the existence of a particular capability. This is a decision made by engineers, researchers and management.
The choice of a particular reward function or loss function, if that is part of the model. This is a decision made by engineers and researchers.
The choice of particular hardware, such as CPUs vs GPUs vs TPUs vs future neuromorphic hardware. Depending on your perspective, this is a decision made by researchers, by chip companies like Nvidia and TSMC, by market forces (gaming & crypto), or by mother nature (some technologies are just more practical than others).
The tastes of the public & the market. For example, the public has responded strongly to AI art and chatbots in the last year, but in years past the public was not impressed enough by either technology to use them on a daily basis or consider them impactful. This is a kind of collective decision we all make, and it impacts how management makes choices. For another example, if the public strongly wanted ChatGPT to be completely uncensored and offensive, OpenAI would have made different choices when building their RLHF system.
The setting of laws and regulations related to AI. This is a decision made by politicians, clerks, lobbyists and activists.
The setting of business policy on what the AI is “allowed” to do. (e.g. ChatGPT will refuse to engage on certain topics, although this is highly hackable.) This is a decision made by management, under pressure from the public, politicians, activists, investors, etc.
The setting of business policy on who is allowed to access the AI (e.g. the general public) and what they are allowed to do with it.
The choice of reaction in the event that an AI is behaving badly, i.e. whether to intervene, modify the model, shut it down, etc. This is a decision made by engineers and management, but their choices will likely be highly influenced by the alignment community, especially if there are prepared plans.
The decision to prepare a plan. Someone, perhaps from this community, might make plans for certain circumstances, such as an AI that is clearly behaving badly. These plans might be helpful in an emergency. This is a decision made by the alignment community, management and researchers.
The decision by the alignment community to communicate in particular ways with particular people, e.g. having a long private conversation with Sam Altman, or publicly appearing on a podcast to discuss alignment. This decision will influence people’s thinking, especially that of the most important decision makers.
The decision to research particular alignment concepts, e.g. Agent Foundations, Shard Theory, the stop button problem, etc. This is a decision made by the alignment community and researchers.
Feel free to add to this list.