[EA’s focus on marginal individual action over structure is a poor fit for dealing with info hazards.]
I tend to think that EAs are sometimes too focused on optimizing the marginal utility of individual actions as opposed to improving larger-scale structures. For example, I think it’d be good if there were as much content and cultural awareness around how to build good organizations as there is around how to improve individual cognition. Think about how often you’ve heard of “self-improvement” or “rationality” as opposed to things like “organizational development”.
(Yes, this is similar to the good old ‘systemic change’ objection, here aimed at “what EAs tend to do in practice” rather than “what is implied by EAs’ normative views”.)
It occurred to me that one area where this might bite in particular is info hazards.
I often see individual researchers agonizing about whether they can publish something they have written, which of several framings to use, and even which ideas are safe to mention in public. I do think that this can sometimes be really important, and that there are areas with a predictably high concentration of such cases, e.g. bio.
However, in many cases I feel like these concerns are far-fetched and poorly targeted.
They are far-fetched when they overestimate the effects a marginal publication by a non-prominent person can have on the world. E.g. the US government isn’t going to start an AGI project because you posted a thought on AI timelines on LessWrong.
They are poorly targeted when they focus on the immediate effects of marginal individual action. E.g., how much does my paper contribute to ‘AI capabilities’? What connotations will readers read into different terms I could use for the same concept?
On the other hand, there often are important info hazards in the areas these researchers are working in. For example, I think it’s at least plausible that there is true information on, say, the prospects of and paths to transformative AI that would be bad to bring to the attention of, say, senior US or Chinese government officials.
It’s not the presence of these hazards but their connection to typical individual researcher actions that I find dubious. To address these concerns, rather than forward-chaining from individual actions one is considering for other reasons, I suspect it’d be more fruitful to backward-chain from the locations of large adverse effects (e.g. the US government starting an AGI project, if you think that’s bad). I suspect this would lead to a focus on structure in the analysis, and on policy for solutions. Concretely, questions like:
What are the structural mechanisms for how information gets escalated to higher levels of seniority within, e.g., the US government or Alphabet?
Given current incentives, how many publications of potentially hazardous information do we expect, and through which channels?
What are the mechanisms that can massively amplify the visibility of information? E.g., when do media consider something newsworthy, and when and how do new academic subfields form?