Towards More Modular Value/Impact Estimates
TL;DR: This post explains why I think we should be more explicit about our values when making impact estimates/value predictions, and it offers some suggestions for how to do this.
There has been a string of recent posts discussing and predicting characteristics of future value (EV, variance, etc.). (How Binary is Longterm Value?, The Future Might Not Be So Great, Parfit + Singer + Aliens = ?, shameless plug, etc.)
Moreover, estimating the “impact” of interventions is a central theme in this community. It is perhaps the core mission of Effective Altruism.
Most of the time when I see posts discussing impact/value, I don't see a definition of value.[1] What we define value to mean (sometimes called ethics, morality, etc.) is the function that converts material outcomes (the "is") into a number for which bigger = better (the "ought").
If someone makes a post engaging in value estimation and doesn't define value, there seem to be two likely outcomes:
1. Most people engaging with the post will use their own internal notion of value.
2. Most people will engage with the post using what they perceive to be the modal value system in the community, so probably total utilitarianism.
I believe these are both sub-optimal outcomes. I do not believe most people engaging with these posts are trying to actively grapple with meta-ethics, so in the first case they might never surface the fact that they are using different internal notions of value. More importantly, the ability to identify and isolate cruxes is central to rationality. We should always aim to modularize our discussions, as this clarifies disagreements in the moment and makes the conclusions of the conversation much more plug-and-play in the future. On some questions of impact, it could turn out that the answer does not depend on the value system we use. But I think this is incredibly unlikely,[2] and in any case we should explicitly come to that conclusion rather than assume it.
In the second case, at least most of us would be on the same page. Of course, not everyone would be. It also isn't as if total utilitarianism is clearly defined: you still need to give utility a usable definition, you need to create a weighting rule or map for sentient beings, and you need to decide whether there is such a thing as a negative life (and if so, where the line is), etc.[3] So you would still have a lesser version of the above problem. Plus, we would then have created an environment with a de facto ethic, which doesn't seem like a good vibe to me.
Suggestions
Primary suggestion: Write your definition of value in your bio, and if you don't clarify it in a given comment/post, people should default to using this definition of value. I'm not sure there is an easily generalizable blueprint for all ethical systems, but here is an example of what a utilitarian version might look like (not my actual values). This could probably be fleshed out more and/or better, but I don't think that matters for the purpose of this post.
BIO
- Ethical Framework: Total Utilitarianism
- Definition of Utility: QALYs, but rescaled so that quality of life can dip negative
- Weighting function: Number of neurons
- Additional Clarifications: I believe this is implicit in my weighting function, but I consider future and digital minds to be morally valuable. My definition of a neuron is (....). I would prefer to use my Coherent Extrapolated Volition over my current value system.
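To make the spirit of this concrete, here is a minimal sketch of what such a bio-level value definition could look like if written down explicitly. Everything in it (the `Being` and `ValueDefinition` names, the `rescaled_qalys` field, the neuron-count weighting) is an illustrative assumption, not a proposed standard:

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Being:
    """A hypothetical moral patient appearing in an outcome forecast."""
    neuron_count: float    # proxy used by the example weighting function
    rescaled_qalys: float  # QALYs rescaled so quality of life can go negative


@dataclass
class ValueDefinition:
    """An explicit definition of value, mirroring the example bio above."""
    framework: str
    weight: Callable[[Being], float]

    def value(self, beings: Iterable[Being]) -> float:
        # Total utilitarianism: sum weighted utility over all beings.
        return sum(self.weight(b) * b.rescaled_qalys for b in beings)


# The example bio, expressed as an explicit value definition (illustrative only).
neuron_weighted_total_util = ValueDefinition(
    framework="Total Utilitarianism",
    weight=lambda b: b.neuron_count,
)
```

The point is only that once the definition is this explicit, readers can swap in their own weighting or utility definition without re-litigating the rest of the estimate.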
Other suggestions I like less:
Suggestion: Define value in your question/comment/post.[4]
Suggestion: Make a certain form of total utilitarianism the de jure meaning of value on the forum when people don’t clearly define value or don’t set a default value in their bio.[5]
Suggestion: Don't do impact estimates in one go; do output/outcome estimates, then do the conversion to value separately. I.e., ask questions like "How many QALYs will there be in the future?" or "How many human rights violations will there be?", etc.
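As a rough illustration of this last suggestion, here is a hedged sketch of what separating outcome estimates from value conversion might look like. The outcome fields, the numbers, and the two example value functions are invented purely for illustration:

```python
from typing import Callable, Dict

# Step 1: outcome estimates, made without committing to any ethic.
# The fields and numbers are placeholders, not real forecasts.
outcome_forecast: Dict[str, float] = {
    "human_qalys": 1e12,
    "animal_suffering_years": 3e11,
}

# Step 2: separate, explicit value functions convert outcomes into value.
ValueFn = Callable[[Dict[str, float]], float]


def total_utilitarian(o: Dict[str, float]) -> float:
    # Example: count QALYs positively and suffering-years negatively at parity.
    return o["human_qalys"] - o["animal_suffering_years"]


def suffering_focused(o: Dict[str, float]) -> float:
    # Example: weight suffering ten times more heavily than positive welfare.
    return o["human_qalys"] - 10 * o["animal_suffering_years"]


value_functions: Dict[str, ValueFn] = {
    "total utilitarian": total_utilitarian,
    "suffering focused": suffering_focused,
}

# The same outcome forecast yields different (here, opposite-sign) value
# estimates depending on the value function applied.
for name, fn in value_functions.items():
    print(f"{name}: {fn(outcome_forecast):.2e}")
```

The same outcome forecast produces opposite-sign impact estimates under the two value functions, which is exactly the kind of crux the earlier sections argue we should surface explicitly.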
- ^
Sometimes I will see something like "my ethics are suffering-focused, so this leads me to x instead of y".
- ^
If we think of morality as being an arbitrary map that takes the world as an input and spits out a (real) number, then it is an arbitrary map from F to ℝ, where F is some set representing possible worlds (technically, the dimensions of the universe are not necessarily comprised of the same sets, so this notation is wrong, plus I don't actually have any idea what I'm talking about). If this is the case, we can basically make the "morality map" do whatever we want. So when asking questions about how the value of the world will end up looking, we can almost certainly create two maps (moralities) that spit out very different answers for the same world.
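To formalize the point slightly (the symbols F, w, V_1, and V_2 are introduced here purely for illustration):

```latex
% Two "morality maps" on the same set of possible worlds F:
V_1 : F \to \mathbb{R}, \qquad V_2 := -V_1 ,
\qquad \text{so for any world } w \in F, \quad V_2(w) = -V_1(w).
```

The same world w then looks very good under one morality map and very bad under the other.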
- ^
I understand what a high bar clarifying these things in every post would be, and I don't think we need to be strict about it, but we should keep these considerations in mind and push towards a world where we communicate this information.
- ^
This seems laborious.
- ^
We can of course make it explicit that we don't endorse this and that it is just a discussion norm. I would still understand if people feel this opens us up to reputational harm and is therefore a bad idea.