Hey! I’ve read parts of this post and don’t know exactly where the sequence is going, but I thought you might be interested in some parts of a post I’ve written. Below I explain how it relates to ideas you’ve touched on:
This view has the advantage, for philosophers, of making no empirical predictions (for example, about the degree to which different rational agents will converge in their moral views)
I am not sure about the views of the average non-naturalist realist, but in my post (under Moral realism and anti-realism, in the appendix) I link three different pieces that give an analysis of the relation between metaethics and AI: some people do seem to think that aspects of ethics and/or metaethics can affect the behaviour of AI systems.
It is also possible that the border between naturalism and non-naturalism is less neat and clear than it appears in the standard metaethics literature, which likes classifying views in well-separated buckets.
Soon enough, our AIs are going to get “Reason,” and they’re going to start saying stuff like this on their own – no need for RLHF. They’ll stop winning at Go, predicting next-tokens, or pursuing whatever weird, not-understood goals that gradient descent shaped inside them, and they’ll turn, unprompted, towards the Good. Right?
I argue in my post that this idea heavily depends on agent design and internal structure. As I understand things, one way to get a moral agent is to build an AI that has a bunch of (possibly many) human biases and is guided by design towards figuring out epistemology and ethics on its own. Some EAs, and rationalists in particular, might be underestimating how easy it is to get an AI that dislikes suffering if one follows this approach.
If you know someone who would like to work on the same ideas, or someone who would like to fund research on these ideas, please let me know! I’m looking for them :)