Jasmine Brazilek
Progress may be possible, but CaML doesn’t have the technical background to make progress on determining how consciousness works, so we leave that to others.
Our current work in this space is on measuring whether AIs take the possibility of consciousness seriously (without being overconfident in one direction or another). So we’re measuring observable behavior: whether a model’s statements and actions are inconsistent with believing either that AI welfare is clearly impossible or that current AIs are definitely conscious. I agree that current methods can provide at best weak and heavily debatable findings (for the reasons the linked post articulates), though I think that’s importantly different from precisely zero evidence.
In science it’s usually a good instinct to dismiss something this unclear, but there are two issues with that in this case (and some others): First, the issue is enormously important if true. Second, the philosophical difficulty of artificial consciousness means that our current confusion doesn’t provide Bayesian evidence either way: we’d expect ourselves to have basically these opinions in worlds where artificial consciousness is the default and also worlds where it’s impossible.
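The second point can be written out in odds form. This is only a sketch: the symbols are labels introduced here, with H_d for "artificial consciousness is the default", H_i for "artificial consciousness is impossible", and E for our current state of confusion.

```latex
\frac{P(H_d \mid E)}{P(H_i \mid E)}
  = \frac{P(E \mid H_d)}{P(E \mid H_i)} \cdot \frac{P(H_d)}{P(H_i)},
\qquad
P(E \mid H_d) \approx P(E \mid H_i)
\;\Longrightarrow\;
\frac{P(H_d \mid E)}{P(H_i \mid E)} \approx \frac{P(H_d)}{P(H_i)}
```

If we would be about equally confused in both kinds of world, the likelihood ratio is close to 1, so observing our confusion leaves the prior odds roughly unchanged.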
I definitely agree and am grateful for your opinion. I am not interested in consciousness research, but I do believe there is tractable work to be done on the risk of AIs causing digital-mind suffering without attempting to solve the consciousness debate.
Thanks Michael. We avoided mentioning post-training so that “new paradigm needed” would also count on the “disagree” side of the spectrum. In other words, “disagree” on this question would mean either “post-training is sufficient” or “new paradigms are needed/sufficient”.
[Question] Community Polls on Alignment Controversies
This is really cool work! Is there a graph you can show summarizing what the agents were doing turn after turn in this simulation? Is there anything that would validate that this is common-sense behavior and that you have made a reasonable simulation here?
I also disagree with the conclusion here. Yes, it’s hard to measure, but we shouldn’t assume we’ll never be able to measure it! Also, all AI values research is dependent on model training regimes. Under the precautionary principle, we should act as though they have welfare until we see clear evidence against that. Thoughtful post though, so thanks for that.