I think we could try to build AGI, but I am skeptical that it would be anything useful or helpful (a broad alignment problem), because of vague or inapt success criteria, and because AGI will lack embodiment (so it won’t get knocked around by the world generally or have emotional/affective learning). Because of these problems, I think we shouldn’t try (1).
Further, I am trying this line of argument out to see if it will encourage (3) (not building AGI), because these concerns cast doubt on the value of AGI to us (and thus the incentives to build it).
This takes on additional potency if we embrace the shift to thinking about “should” and not just “can” in scientific and technological development generally. So that brings us to the question I think we should be asking, which is how to encourage a properly responsible approach to AI, rather than shifting credences on the Future Fund’s propositions.
Does that make sense?
Hmm, I guess I don’t think lack of emotional/affective states is a problem for making useful AGIs. Obviously those are part of how humans learn, but it seems like a machine can learn with any reward function—it just needs some way of mapping a world state to value.
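To make “mapping a world state to value” concrete, here’s a minimal Python sketch (the state fields and weights are invented purely for illustration, not drawn from any real system): a reward function is just some function from an observed state to a scalar, and a learner can optimise against it with no emotional machinery involved.

```python
from dataclasses import dataclass

@dataclass
class WorldState:
    # Hypothetical features the agent can observe.
    revenue: float
    costs: float
    customer_complaints: int

def reward(state: WorldState) -> float:
    """Map a world state to a scalar value.

    Nothing here is an emotion; it's just a hand-written scoring rule
    that a learning algorithm could optimise against.
    """
    profit = state.revenue - state.costs
    complaint_penalty = 0.5 * state.customer_complaints
    return profit - complaint_penalty

# The agent can now compare candidate states by value.
before = WorldState(revenue=100.0, costs=80.0, customer_complaints=4)
after = WorldState(revenue=110.0, costs=85.0, customer_complaints=1)
print(reward(before), reward(after))  # the second state scores higher
```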
Re success criteria, you could, for example, train an AI to improve a company’s profit in a simulated environment. That task requires a broad set of capacities, including high-level ones like planning/strategising. If you do this for many things humans care about, you’ll get a more general system, as with DeepMind’s Gato. But of course I’m speculating.
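As a toy version of that profit example (the simulator, the demand curve, and the hill-climbing loop are all invented stand-ins, not a claim about how Gato or any real system is trained), the shape is: a simulated environment, a scalar objective, and an optimisation loop that improves behaviour against it.

```python
import random

def simulate_profit(price: float) -> float:
    """Toy company simulator: demand falls off linearly as price rises."""
    demand = max(0.0, 100.0 - 2.0 * price)
    unit_cost = 10.0
    return (price - unit_cost) * demand

def train(steps: int = 2000, seed: int = 0) -> float:
    """Naive hill climbing: propose a nearby price, keep it if profit improves."""
    rng = random.Random(seed)
    price = 15.0
    best = simulate_profit(price)
    for _ in range(steps):
        candidate = price + rng.gauss(0.0, 1.0)
        profit = simulate_profit(candidate)
        if profit > best:
            price, best = candidate, profit
    return price

if __name__ == "__main__":
    learned = train()
    print(f"learned price {learned:.2f}, profit {simulate_profit(learned):.1f}")
```

A real system would be enormously more complex, but the training pressure is the same kind of thing: whatever raises the objective gets kept.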
I suppose if you don’t think there’s any value for us in AGI, and if you don’t think there are sufficient incentives for us to build it, there’s no need to encourage not building it? Or is your concern more that we’re wasting energy and resources trying to build it, or even thinking about it?
The first proposition—“Conditional on AGI being developed by 2070, humanity will go extinct or drastically curtail its future potential due to loss of control of AGI”—seems directly linked to whether or not we should build AGI. If AGI carries a serious risk of catastrophe, we obviously shouldn’t build it. So to me it looks like the Future Fund is already thinking about the “should” question?
There are two plausible ways to embody AGI (a rough sketch of the first follows this list):
1. As a remote supervisor of a dumb robot body: the AGI controls the body from elsewhere, processing a portion or all of the robot’s sensor data.
2. As a resident of the robot body: the AGI’s hardware is housed in the body itself.
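Roughly what I mean by the first option, as a sketch (the names and message shapes are mine, just to illustrate the division of labour): the body is a dumb loop that streams sensor readings out and executes whatever commands come back, while all the intelligence lives in the remote controller.

```python
from dataclasses import dataclass
from typing import Callable, Protocol

@dataclass
class SensorFrame:
    # Hypothetical readings streamed from the robot body.
    camera: bytes
    joint_angles: list[float]
    damage_level: float  # a crude "physical pain" signal

@dataclass
class MotorCommand:
    joint_torques: list[float]

class RemoteController(Protocol):
    """The AGI, running elsewhere, that supervises the dumb body."""
    def decide(self, frame: SensorFrame) -> MotorCommand: ...

def body_loop(
    controller: RemoteController,
    read_sensors: Callable[[], SensorFrame],
    apply_command: Callable[[MotorCommand], None],
) -> None:
    """The body holds no intelligence: sense, transmit, actuate, repeat."""
    while True:
        frame = read_sensors()
        command = controller.decide(frame)  # round trip to the remote AGI
        apply_command(command)
```

The second option is the same interface with `controller.decide` running on hardware inside the body; what changes is mainly latency and the fact that the body no longer depends on an outside link.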
I can plausibly see such sensors capturing physical pain, but not emotional pain. Emotional pain is the far more potent teacher of what is valuable and what is not, of what is important and what is not. Intelligence needs direction of this sort in order to learn.
So, can you build an embodied AGI with emotional responses built in, responses that last the way emotions do and so can teach the way emotions do? Building empathy (for both happiness and suffering) and the pain of disapproval into an AGI would be crucial.
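One way to read “responses that last the way emotions do” in reward terms (this is my own toy construction, not an established technique): a disapproval event doesn’t just cost reward once, it pushes down a slowly decaying internal state, and that lowered state keeps colouring reward for a long stretch afterwards, which is closer to how guilt or shame teaches than an instantaneous penalty is.

```python
class MoodShapedReward:
    """Wrap a base reward with a slowly decaying 'mood' term."""

    def __init__(self, decay: float = 0.99):
        self.decay = decay  # close to 1.0 means the "emotion" lingers
        self.mood = 0.0     # negative values are lingering distress

    def step(self, base_reward: float, disapproval: float = 0.0) -> float:
        # Disapproval lowers mood; mood then decays back toward zero slowly,
        # dragging the shaped reward down for many steps after the event.
        self.mood = self.decay * self.mood - disapproval
        return base_reward + self.mood

# A single strong disapproval at step 10 keeps depressing the shaped
# reward long after the event itself has passed.
shaper = MoodShapedReward(decay=0.99)
for t in range(30):
    shaped = shaper.step(base_reward=1.0, disapproval=5.0 if t == 10 else 0.0)
    if t in (9, 10, 20, 29):
        print(t, round(shaped, 2))
```

Whether anything like this would actually amount to empathy or the pain of disapproval is exactly the open question.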