Hi Jack! Wonderful to hear that you’ve been reading up on all these sources already!
Rethink Priorities has identified lots of markers that we can draw on to get a bit of a probabilistic idea of whether invertebrates are sentient. I wonder which of these might carry over to digital sentience. (It’s probably hard to arrive at strong opinions on this, but if we did, I’d also be worried that those opinions could be infohazardous.) Reinforcement learning (testable through classical conditioning) is a marker that I think is particularly fundamental. When I talk about sentience, I typically mean positively and negatively valenced feedback, i.e., phenomenal consciousness. That is intimately tied to reinforcement learning because an agent has no reason to value or disvalue certain feedback unless it is inherently desirable or undesirable to the agent. This doesn’t need to be pain or stress (just as we can correct someone without causing them pain or stress), and it’s unclear how intense it is anyway, but at least when classical conditioning behavior is present, I’m extra cautious, and when it’s absent, I’m less worried that the system might be conscious. A minimal sketch of what “testable through classical conditioning” could look like computationally is below.
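(To make that concrete: here’s a tiny Rescorla-Wagner-style sketch of the acquisition/extinction pattern one would look for in a classical conditioning test. The function name, learning rate, and trial schedule are all illustrative assumptions of mine, not a proposed protocol for probing digital sentience.)

```python
# Minimal Rescorla-Wagner sketch of classical conditioning (illustrative only).
# All names and parameters are hypothetical choices for this example.

def rescorla_wagner(trials, alpha=0.3, reward_magnitude=1.0):
    """Track the associative strength V of a conditioned stimulus (CS)
    across trials where it is or isn't paired with an unconditioned
    stimulus (US). V rising toward the reward magnitude is the classic
    acquisition curve; behaviorally, the system starts responding to
    the CS alone, and V falls again under extinction (CS without US)."""
    v = 0.0
    history = []
    for cs_present, us_present in trials:
        if cs_present:
            prediction_error = (reward_magnitude if us_present else 0.0) - v
            v += alpha * prediction_error  # learning driven by surprise
        history.append(v)
    return history

# Acquisition: 10 CS+US pairings, then extinction: 10 CS-only trials.
trials = [(True, True)] * 10 + [(True, False)] * 10
print(rescorla_wagner(trials))
```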
You’ve probably seen Tobias’s typology of s-risks. I’m particularly worried about agential s-risks where the AI, though it might not have phenomenal consciousness itself, creates beings that do, such as emulations of animal brains. But there are also incidental s-risks, which are particularly worrying if the AI ends up in a situation where it has to create a lot of aligned subagents, e.g., because it has expanded a lot and is incurring communication delays. But generally I think you’ll hear the most convincing arguments in 1:1 conversations with people from CLR, CRS, probably MIRI, and others.