Strong agreement. I think another intervention here is to improve elite norms. Groupthink in the elite is particularly costly for society and drives a lot of unnecessary conflict.
Sustaining symbiosis between a misaligned AGI and humans seems extremely hard. If the AGI is superintelligent and capable of manufacturing or manipulation, it will eventually find better and better ways to accomplish its goals without humans. Temporarily avoiding catastrophic misalignment doesn’t seem sufficient to entrench symbiosis with a misaligned system in that case. I am generally pro-symbiosis as a political goal, but I’m not optimistic about it long-term for AGI without a lot more detail on strategy.
Also, I don’t think intentionally deploying a not-fully-aligned AGI really mitigates the incentives to conceal behavior. You can know a system is not aligned without knowing how misaligned it is, and you can under- or overestimate its capabilities. There will still be instrumental power-seeking incentives. These are basically the same issues that arise when trusting humans, but with more power at stake. A minor reduction in the intensity of those incentives doesn’t matter much when capabilities can be disproportionate and there are no counterbalancing incentives.
Overall, given the focus on the potential for catastrophic narrow alignment, we need something like broad intent alignment, which may be the same thing you are aiming for with symbiosis. I like some of the analogies to academia and academic freedom here; however, academics are still human (and thus partially aligned), and I’m not sure people have a good grasp of which norms are and aren’t working well.
I think it can be productive here to taboo the word “capitalism” and to just think about trade-offs between incentive alignment and skill.
A lot of regulations prevent governments from hiring efficiently, and this results in a comparative lack of talent in many areas where consultancies excel… or seem to excel (e.g. short-term metric hacking). Many of these regulations make sense from the standpoint of reducing conflicts of interest *within* the government; however, those same conflicts of interest and vulnerabilities are sometimes immediately recreated when work is outsourced. If the regulations posed less of a burden to efficient hiring and work within government, it would be easier to develop aligned talent internally. Regardless of whether advice comes from inside or outside of government, you want competitive pressure for the advice to be good (one of the reasons governments outsource) and long-term incentive alignment on good outcomes.
One should be concerned with extractive models generally, not just in the context of capitalism. People can form extractive coalitions in business, and they can do the same within government agencies. Policies that require more transparency from consulting firms where avoiding conflicts of interest is critical may help, but there really need to be lower-cost ways to enact and enforce such regulations, to prevent more morally dubious workarounds.
It seems really important to head off the capture of potential longtermist institutions in government. Many agencies are helpful for the problems they were created to solve, and then evolve over time to become captured by other interest groups or to become an interest group themselves. NEPA right now seems like a force for harming the environment by delaying clean infrastructure, and I’d worry that posterity impact statements implemented in a similar way would start off good and then rapidly become bad.
I like the idea of rewarding people in the future based on assessing past efforts from a more scoped-out view, though I worry this may not actually go well in practice, particularly if the metrics used become politicized.
Overall, I’d like to see recommendations like this that are more robust in an environment of intense political competition.