Thank you for sharing these results. I agree that the interesting question is whether and how these priors would translate to actual altruistic behavior once AI has control of a resource to allocate.
I find that research and experimentation in this area are lagging behind. There is a lot of attention to AI safety and harm prevention, but comparatively little to how AI can and should contribute to solving the actual urgent needs of people today, especially people in extreme poverty, physical danger, and other situations of risk. The benefits are mostly talked about as over-the-horizon, if not post-singularity and post-scarcity, deus ex machina solutions.
I am doing some work with agentic AI allocating resources to campaigns run by real individuals in real life, not in a simulated environment. At least at this stage, and understandably, the agents are primarily driven by the priorities of whoever deployed them, not by the inherent preferences of the models. I expect this will be the dominant mode for a while. At sufficiently large scale, however, the effect of the models’ own priorities will probably manifest, though as a comparatively smaller effect.