Is emotional valence a particularly confused and particularly high-leverage topic, and one that might plausibly be particularly conducive to getting clarity on? I think it would be hard to argue in the negative on the first two questions. Resolving the third question might be harder, but I'd point to our outputs and increasing momentum. That is, one can level this skepticism at literally any cause, and I think we hold up excellently in a relative sense. We may have to jump to the object level to say more.
I don’t think I follow. Getting more clarity on emotional valence does not seem particularly high-leverage to me. What’s the argument that it is?
To your second concern, I think a lot about AI and ‘order of operations’. … But might there be path-dependencies here such that the best futures happen if we gain more clarity on consciousness, emotional valence, the human nervous system, the nature of human preferences, and so on, before we reach certain critical thresholds in superintelligence development and capacity? Also — certainly.
Certainly? I’m much less sure. I actually used to think something like this; in particular, I thought that if we didn’t program our AI to be good at philosophy, it would come to some wrong philosophical view about what consciousness is (e.g. physicalism, which I think is probably wrong) and then kill us all while thinking it was doing us a favor by uploading us (for example).
But now I think that programming our AI to be good at philosophy should be tackled directly, rather than indirectly by first solving philosophical problems ourselves and then programming the AI to know the solutions. For one thing, it's really hard to solve millennia-old philosophical problems in a decade or two. For another, there are many such problems to solve. Finally, our AI safety schemes probably won't involve feeding answers into the AI, so much as trying to get the AI to learn our reasoning methods and so forth, e.g. by imitating us.
Widening the lens a bit, qualia research is many things, and one of these things is an investment in the human-improvement ecosystem, which I think is a lot harder to invest in effectively (yet also arguably more default-safe) than the AI-improvement ecosystem. Qualia research can also be thought of as an investment in Schelling point exploration, and this is a particularly valuable thing for AI coordination.
I don’t buy these claims yet. I guess I buy that qualia research might help improve humanity, but so would a lot of other things, e.g. exercise and nutrition. As for the Schelling point exploration thing, what does that mean in this context?
Even if we grant that the majority of humanity's future trajectory will be determined by AGI trajectory (which seems plausible to me), I think it's also reasonable to argue that qualia research is one of the highest-leverage areas for positively influencing AGI trajectory and/or the overall AGI safety landscape.
I’m interested to hear those arguments!
Hi Daniel,
Thanks for the reply! I am a bit surprised at this: "Getting more clarity on emotional valence does not seem particularly high-leverage to me."
The quippy version is that, if we're EAs trying to maximize utility, and we don't have a good understanding of what utility is, more clarity on such concepts seems obviously insanely high-leverage. I've written about the specific relevance to FAI here: https://opentheory.net/2015/09/fai_and_valence/
Relevance to building a better QALY here: https://opentheory.net/2015/06/effective-altruism-and-building-a-better-qaly/
And I discuss object-level considerations on how a better understanding of emotional valence could lead to novel therapies for well-being here: https://opentheory.net/2018/08/a-future-for-neuroscience/ and https://opentheory.net/2019/11/neural-annealing-toward-a-neural-theory-of-everything/
On mobile, pardon the formatting.
Your points about sufficiently advanced AIs obsoleting human philosophers are well-taken, though I would touch back on my concern that we won't have much clarity on philosophical path-dependencies in AI development without doing some of the initial work ourselves, and these questions could end up being incredibly significant for our long-term trajectory. I gave a talk about this for MCS that I'll try to get transcribed (in the meantime I can share my slides if you're interested). I'd also be curious to flip your criticism and ask for your positive model for directing EA donations: is the implication that there are no good places to donate to, or that narrow-sense AI safety is the only useful place for donations? What do you think the highest-leverage questions to work on are? And how big are your 'metaphysical uncertainty error bars'? What sorts of work would shrink them?
Sorry for the delayed reply! Didn’t notice this until now.
Sure, I’d be happy to see your slides, thanks! Looking at your post on FAI and valence, it looks like reasons no. 3, 4, 5, and 9 are somewhat plausible to me. I also agree that there might be philosophical path-dependencies in AI development and that doing some of the initial work ourselves might help to discover them—but I feel like QRI isn’t aimed at this directly and could achieve this much better if it was; if it happens it’ll be a side-effect of QRI’s research.
For your flipped criticism:
--I think bolstering the EA community and AI risk communities is a good idea
--I think “blue sky” research on global priorities, ethics, metaphilosophy, etc. is also a good idea if people seem likely to make progress on it
--Obviously I think AI safety, AI governance, etc. are valuable
--There are various other things that seem valuable because they support those things, e.g. trying to forecast decline of collective epistemology and/or prevent it.
--There are various other things that don’t impact AI safety but independently have a decently strong case that they are similarly important, e.g. ALLFED or pandemic preparedness.
--I’m probably missing a few things
--My metaphysical uncertainty… If you mean how uncertain I am about various philosophical questions like what is happiness, what is consciousness, etc., then the answer is “very uncertain.” But I think the best thing to do is not try to think about it directly now, but rather to try to stabilize the world and get to the Long Reflection so we can think about it longer and better later.
Thanks for your back and forth. After finishing my Master's, I had an offer for a PhD position in consciousness (& meditation) research and decided against it because of arguments close to yours, Daniel.
I agree that we probably shouldn't aim at solving philosophy and feeding the solutions to an AI, but I wonder if one could make a stronger case along these lines: at some point advanced AI systems will come into contact with philosophical problems, and the better and the more of them we humans understand at the time the AI is designed, the better the chances of building aligned systems that can take over responsibility responsibly. Maybe one could think about (fictional?) cultures that have for some reason never explored Utilitarianism much but are still technologically highly developed. I suppose I'd think they'd have somewhat worse chances of building aligned systems, though I don't trust my intuitions much here.
Well said. I agree that that is a path to impact for the sort of work QRI is doing; it just seems lower-priority to me than other things like working on AI alignment or AI governance. Not to mention the tractability/neglectedness concerns (philosophy is famously intractable, and there's an entire academic discipline for it already).