Formalizing longtermism
Longtermism is defined as the view that “what most matters about our actions is their very long term effects”. What does this mean, formally? Below I set up a model of a social planner maximizing social welfare over all generations. With this model, we can give a precise definition of longtermism.
A model of a longtermist social planner
Consider an infinitely-lived representative agent with population size $N_t$. In each period there is a risk of extinction, governed by an extinction rate $\delta_t$.
The basic idea is that economic growth is a double-edged sword: it increases our wealth, but it also increases the risk of extinction. In particular, ‘consumption research’ develops new consumption technologies, and these technologies increase both consumption and extinction risk.
Consumption and new consumption technologies each have their own production function: consumption is produced by consumption workers using the existing stock of consumption technologies, and new consumption technologies are produced by consumption scientists.
However, we can also develop safety technologies to reduce extinction risk: safety research produces new safety technologies, which safety workers then use to produce ‘safety goods’.
The extinction rate depends on both sides of this trade-off: the number of consumption technologies directly increases risk, and the number of safety goods directly reduces it.
Let $M_t$ denote the probability that civilization survives through period $t$, given the path of the extinction rate.
Now we can set up the social planner’s problem: choose the number of scientists (vs workers), the number of safety scientists (vs consumption scientists), and the number of safety workers (vs consumption workers) to maximize social welfare. That is, the planner chooses an allocation of labor $L = (L_0, L_1, L_2, \ldots)$, one allocation $L_t$ for each generation.
The social welfare function is $U = \int_0^\infty M_t N_t u(c_t) e^{-\rho t}\,dt$.
The planner maximizes utility over all generations (from $t=0$ to $\infty$), weighting by population size $N_t$ and accounting for extinction risk via the survival probability $M_t$. The optimal allocation $L^*$ is the allocation that maximizes social welfare.
The planner discounts using the Ramsey equation, $\rho = \delta + \gamma g$, where $\rho$ is the discount rate, $\delta$ the exogenous extinction risk, $\gamma$ the degree of risk-aversion (i.e., diminishing marginal utility), and $g$ the growth rate. (Note that $g$ could be time-varying.)
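To make the planner’s objective concrete, here is a minimal discrete-time sketch in Python. The functional forms and parameter values are illustrative placeholders of my own choosing (linear research production, CRRA utility, and a hazard rate that rises with the consumption-technology stock and falls with safety goods), not the model’s actual equations; the point is just to show how an allocation path maps to survival-weighted, discounted utility.

```python
import math

# Illustrative, made-up parameters; nothing here is calibrated.
RHO = 0.01        # discount rate (exogenous part; see the Ramsey discussion above)
GAMMA = 0.5       # utility curvature, u(c) = c**(1 - GAMMA) / (1 - GAMMA)
THETA_A = 0.02    # productivity of consumption research
THETA_B = 0.02    # productivity of safety research
ALPHA = 0.001     # how strongly consumption technologies raise extinction risk
N = 1.0           # population size, held constant for simplicity


def utility(c):
    """CRRA utility with GAMMA in (0, 1), so u > 0 for c > 0."""
    return c ** (1 - GAMMA) / (1 - GAMMA)


def social_welfare(allocation_path):
    """Survival-weighted, discounted utility for a path of allocations.

    Each element of allocation_path is (s, sigma_R, sigma_W):
      s       -- fraction of people doing research (vs. production)
      sigma_R -- fraction of researchers doing safety research
      sigma_W -- fraction of workers producing safety goods
    Returns (total_welfare, per_generation_contributions).
    """
    A, B = 1.0, 1.0   # stocks of consumption and safety technologies
    M = 1.0           # probability of having survived to the current period
    contributions = []
    for t, (s, sigma_R, sigma_W) in enumerate(allocation_path):
        sci_c = N * s * (1 - sigma_R)        # consumption scientists
        sci_s = N * s * sigma_R              # safety scientists
        wrk_c = N * (1 - s) * (1 - sigma_W)  # consumption workers
        wrk_s = N * (1 - s) * sigma_W        # safety workers

        c = max(A * wrk_c / N, 1e-9)         # consumption per capita
        h = B * wrk_s                        # safety goods
        delta = ALPHA * A / (1.0 + h)        # extinction hazard this period

        contributions.append(M * N * utility(c) * math.exp(-RHO * t))

        M *= math.exp(-delta)                # update survival probability
        A += THETA_A * A * sci_c             # consumption research
        B += THETA_B * B * sci_s             # safety research
    return sum(contributions), contributions


if __name__ == "__main__":
    horizon = 500
    no_safety = [(0.1, 0.0, 0.0)] * horizon     # all research and work on consumption
    heavy_safety = [(0.1, 0.5, 0.5)] * horizon  # half of research and work on safety
    print("no safety:   ", social_welfare(no_safety)[0])
    print("heavy safety:", social_welfare(heavy_safety)[0])
```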
Here there is no pure time preference; the planner values all generations equally. Weighting by population size means that this is a total utilitarian planner.
Defining longtermism
With the model set up, now we can define longtermism formally. Recall the informal definition that “what most matters about our actions is their very long term effects”. Here are two ways that I think longtermism can be formalized in the model:
(1) The optimal allocation in our generation, $L_0^*$, should be focused on safety work: the majority (or at least a sizeable fraction) of workers should be in safety research or production, and only a minority in consumption research or production. (Or the same should hold for the first several values of $t$, to capture that the next few generations need to work on safety.) This is saying that our time has high hingeyness due to existential risks. It’s also saying that safety work is currently uncrowded and tractable.
(2) Small deviations from $L_0^*$ (the optimal allocation in our generation) will produce large decreases in total social welfare $U$, driven by generations $t \geq T$ for some large $T$. In other words, our actions today have very large effects on the long-term future. We could plot the per-generation contribution to social welfare against $t$ for $L^*$ and some suboptimal alternative $\bar{L}$, and show that the contribution under $\bar{L}$ is much smaller than under $L^*$ in the tail.
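As a sketch of how (2) could be checked numerically (for example with the per-generation welfare contributions produced by the simulation above, or by any other model), one can ask what fraction of the welfare gap between $L^*$ and a perturbed allocation $\bar{L}$ accrues in generations $t \geq T$. The helper and the toy numbers below are my own illustration.

```python
def tail_share_of_welfare_gap(contrib_optimal, contrib_perturbed, T):
    """Fraction of U(L*) - U(L_bar) contributed by generations t >= T.

    Both arguments are lists of per-generation welfare contributions of
    equal length.  Definition (2) roughly says this fraction should be
    close to 1 for some large T.
    """
    diffs = [a - b for a, b in zip(contrib_optimal, contrib_perturbed)]
    total_gap = sum(diffs)
    if total_gap == 0:
        return float("nan")
    return sum(diffs[T:]) / total_gap


if __name__ == "__main__":
    # Toy streams: the perturbed allocation does slightly better early on
    # (more consumption now) but much worse later (higher extinction risk).
    optimal = [1.0] * 200
    perturbed = [1.05] * 20 + [0.4] * 180
    share = tail_share_of_welfare_gap(optimal, perturbed, T=50)
    print(f"share of welfare gap from generations t >= 50: {share:.2f}")
```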
While longtermism has an intuitive foundation (being intergenerationally neutral or having zero pure time preference), the commonly-used definition makes strong assumptions about tractability and hingeyness.
Further thoughts
This model focuses on extinction risk; another approach would look at trajectory changes.
Also, it might be interesting to incorporate Phil Trammell’s work on optimal timing/giving-now vs giving-later. Eg, maybe the optimal solution involves the planner saving resources to be invested in safety work in the future.
Do you have any notion as to the solution to this model (for some reasonable parameter values)? I’ve tried to solve models like this one and haven’t succeeded, although I’m not good at differential equations.
It looks to me like it’s unsolvable without some nonzero exogenous extinction risk, because otherwise there will be multiple parameter choices that result in infinite utility, so you can’t say which one is best. But it’s not clear what rate of exogenous x-risk to use, and our distribution over possible values might still result in infinite utility in expectation.
Perhaps you could simplify the model by leaving out the concept of improving technology, and just say you can either spend on safety, spend on consumption, or invest to grow your capital. That might make the model easier to solve, and I don’t think it loses much explanatory power. (It would still have the infinity problem.)
Christian Tarsney has done a sensitivity analysis for the parameters in such a model in The Epistemic Challenge to Longtermism for GPI.
There’s also the possibility that the space we would otherwise occupy if we didn’t go extinct will become occupied by sentient individuals anyway, e.g. life reevolves, or aliens. These are examples of what Tarsney calls positive exogenous nullifying events, with extinction being a typical negative exogenous nullifying event.
There’s also the heat death of the universe, although it’s only a conjecture.
There are some approaches to infinite ethics that might allow you to rank some different infinite outcomes, although not necessarily all of them. See the overtaking criterion. These might make assumptions about order of summation, though, which is perhaps undesirable for an impartial consequentialist, and without such assumptions, conditionally convergent series can be made to sum to anything or diverge just by reordering them, which is not so nice.
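To give a standard concrete example of that last point: the alternating harmonic series $1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \cdots$ converges to $\ln 2$, but by Riemann’s rearrangement theorem its terms can be reordered to converge to any real number, or to diverge. A welfare sum with this character would therefore depend on the order in which generations (or people) are counted.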
My model here is riffing on Jones (2016); you might look there for solving the model.
Re infinite utility, Jones does say (fn 6): “As usual, ρ must be sufficiently large given growth so that utility is finite.”
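To spell out that condition in a simple special case (an illustration of mine, assuming constant exponential growth rather than anything derived from the model above): with CRRA utility $u(c) = c^{1-\gamma}/(1-\gamma)$ for $\gamma \in (0,1)$, per-capita consumption $c_t = c_0 e^{gt}$, and population $N_t = N_0 e^{nt}$, the integrand $N_t u(c_t) e^{-\rho t}$ grows like $e^{[n + (1-\gamma)g - \rho]t}$, so total utility is finite only if $\rho > n + (1-\gamma)g$ (the survival factor $M_t \le 1$ can only help convergence).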
I haven’t read most of GPI’s stuff on defining longtermism, but here are my thoughts. I think (2) is close to what I’d want for a definition of very strong longtermism—“the view on which long-run outcomes are of overwhelming importance”
I think we should be able to model longtermism using a simpler model than yours. Suppose you’re taking a one-off action $d \in D$, and then you get (discounted) rewards $r_1(d), r_2(d), \ldots$. Then I’d say very strong longtermism is true iff the impact of each decision depends overwhelmingly on its long-term impact:
$\forall d \in D, \quad \sum_{t=0}^{\infty} r_t(d) \approx \sum_{t=t'}^{\infty} r_t(d)$, where $t'$ is some large number.
You could stipulate that the discounted utility of the distant future has to be within a factor: $(1-\epsilon)\sum_{t=0}^{\infty} r_t(d) < \sum_{t=t'}^{\infty} r_t(d) < (1+\epsilon)\sum_{t=0}^{\infty} r_t(d)$, where $\epsilon \in (0,1)$. If you preferred, you could talk about the differences between utilities for all pairs of decisions, rather than the utility of each individual decision. Or small deviations from optimal. Or you could consider sequential decision-making, assuming that later decisions are made optimally. Or you could assume a distribution over $D$ (e.g. the distribution of actual human decisions), and talk about the amount of variance in total utility explained by their long-term impact. But these are philosophical details—overall, we should land somewhere near your (2).
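A literal translation of that stipulation into code might look like the sketch below; the function name, the toy reward streams, and the truncation at a finite horizon are all assumptions of mine.

```python
def is_very_strong_longtermist(rewards_by_action, t_prime, eps):
    """Check the epsilon-band condition for every action.

    rewards_by_action maps each action d to its (discounted) reward stream
    r_0(d), r_1(d), ..., truncated at some finite horizon.  Returns True iff,
    for every action, the long-term part of the sum (t >= t_prime) is within
    a factor of (1 - eps, 1 + eps) of the whole sum.
    """
    for rewards in rewards_by_action.values():
        total = sum(rewards)
        tail = sum(rewards[t_prime:])
        if not ((1 - eps) * total < tail < (1 + eps) * total):
            return False
    return True


# Toy example: two actions whose value is almost entirely long-term.
streams = {
    "fund_safety": [0.01] * 10 + [1.0] * 990,
    "fund_consumption": [0.05] * 10 + [0.5] * 990,
}
print(is_very_strong_longtermist(streams, t_prime=10, eps=0.01))  # True
```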
It’s not super clear to me that we want to formalise longtermism—“the ethical view that is particularly concerned with ensuring long-run outcomes go well”. If we did, it might say that sometimes $\sum_{t=t'}^{\infty} r_t(d) - \sum_{t=t'}^{\infty} r_t(d')$ is big, or that it can sometimes outweigh other considerations.
Your (1) is interesting, but it doesn’t seem like a definition of longtermism. I’d call it something like safety investment is optimal, because it pertains to practical concerns about how to attain long-term utility.
Rather, I think it’d be more interesting to try to prove that (1) follows from longtermism, given certain model assumptions (such as yours). To see what I have in mind, we could elaborate my setup. Setup: let the decision space be $d \in [0,1]$, where $d$ represents the fraction of resources you invest in the long term. Each $r_t$ with $t \geq t'$ is an increasing function of $d$, and each $r_t$ with $t < t'$ is a decreasing function of $d$. Then we could have a conjecture. Conjecture: if strong longtermism is true (for some $t'$ and $\epsilon$), then the optimal action will be $d=1$ (or $d > f(\epsilon)$, for some function $f$ of $\epsilon$). Proof: since we assume that only long-term impact matters, the action with the best long-term impact, $d=1$, is best overall.
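As a quick numerical sanity check of the conjecture (an illustration under one made-up reward specification, not a proof): suppose short-term rewards sum to $1-d$ and long-term rewards sum to $100(1+d)$, so the long-term share exceeds $0.99$ for every $d$ (strong longtermism holds with a small $\epsilon$), and a grid search then finds the optimum at $d=1$.

```python
def total_reward(d, t_prime=10, horizon=1000):
    """Made-up reward stream: short-term rewards shrink in d, long-term rewards grow in d."""
    short = [(1 - d) / t_prime] * t_prime                                    # sums to 1 - d
    long_term = [100 * (1 + d) / (horizon - t_prime)] * (horizon - t_prime)  # sums to 100 * (1 + d)
    return sum(short) + sum(long_term)


grid = [i / 100 for i in range(101)]    # candidate values of d in [0, 1]
best_d = max(grid, key=total_reward)
print("optimal d on the grid:", best_d)  # 1.0
```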
Perhaps a weaker version could be proved in an economic model.
I like the simpler/more general model, although I think you should also take expectations, and allow for multiple joint probability distributions over the outcomes of a single action to reflect our deep uncertainty (there’s also moral uncertainty, but I would deal with that separately, on top of this). That almost all of the (change in) value happens in the long-term future isn’t helpful to know if we can’t predict which direction it goes.
Greaves and MacAskill define strong longtermism this way:
So this doesn’t say that most of the value of any given action is in the tail of the rewards; perhaps you can find some actions with negligible ex ante longterm consequences, e.g. examples of simple cluelessness.
Sure, that definition is interesting—seems optimised for advancing arguments about how to do practical ethical reasoning. I think a variation of it would follow from mine—an ex-ante very good decision is contained in a set of options whose ex ante effects on the very long-run future are very good.
Still, it would be good to have a definition that generalises to suboptimal agents. Suppose that what’s long-term optimal for me is to work twelve hours a day, but it’s vanishingly unlikely that I’ll do that. Then what can longtermism do for an agent like me? It’d also make sense for us to be able to use longtermism to evaluate the actions of politicians, even if we don’t think any of the actions are long- or short-term optimal.
You could just restrict the set of options, or make the option the intention to follow through with the action, which may fail (and backfire, e.g. burnout), so adjust your expectations keeping failure in mind.
Or attach some probability of actually doing each action, and hold that for any positive EV shorttermist option, there’s a much higher EV longtermist option which isn’t much less likely to be chosen (it could be the same one for each, but need not be).
I’m a bit confused by this setup. Do you mean that $d$ is analogous to $L_0$, the allocation for $t=0$? If so, what are you assuming about $L_t$ for $t>0$? In my setup, I can compare $U(\bar{L}_0, L_1^*, L_2^*, \ldots)$ to $U(L_0^*, L_1^*, L_2^*, \ldots)$, so we’re comparing against the optimal allocation, holding $L_t^*$ fixed for $t>0$.
I’m not sure this works. Consider: this condition would also be satisfied in a world with no x-risk, where each generation becomes successively richer and happier, and there’s no need for present generations to care about improving the future. (Or are you defining $r_t(d)$ as the marginal utility of $d$ on generation $t$, as opposed to the utility level of generation $t$ under $d$?)
$d$ is a one-off action taken at $t=0$ whose effects accrue over time, analogous to $L$. (I could be wrong, but I’m proposing that the “long-term” in longtermism refers to utility obtained at different times, not actions taken at different times, so removing the latter helps bring the definition of longtermism into focus.)
Are you saying that actions could vary in their short-term goodness and long-term goodness, but in such a way that short- and long-term goodness are perfectly correlated? To me, this is a world where longtermism is true (we can tell an agent’s value from its long-term value), and also a world where shorttermism is true. Generations only need to care about the future if longtermism works but other heuristics fail. To your question: $r_t(d)$ is just the utility at time $t$ under $d$.
In my setup, I could say $\int_0^\infty M_t N_t u(c_t) e^{-\rho t}\,dt \approx \int_T^\infty M_t N_t u(c_t) e^{-\rho t}\,dt$ for some large $T$; i.e., generations $0$ to $T-1$ contribute basically nothing to total social utility $U$. But I don’t think this captures longtermism, because it is consistent with the social planner allocating no resources to safety work (and all resources to consumption by the current generation); the condition puts no constraints on $L^*$. In other words, this condition only matches the first of the three criteria that Will lists:
Interesting—defining longtermism as rectifying future disprivilege. This is different from what I was trying to model. Honestly, it seems different from all the other definitions. Is this the sort of longtermism that you want to model?
If I was trying to model this, I would want to make reference to a baseline level of disparity, given inaction, and then consider how a (possibly causal) intervention could improve that.
Do you think Will’s three criteria are inconsistent with the informal definition I used in the OP (“what most matters about our actions is their very long term effects”)?
Not inconsistent, but I think Will’s criteria are just one of many possible reasons that this might be the case.
On Will’s definition, longtermism and shorttermism are mutually exclusive.
For another model and analysis, check out Christian Tarsney’s The Epistemic Challenge to Longtermism.
I know this is a bit contrarian, but I’m asking just out of interest rather than critique. What are your thoughts on neartermism, and might it also require formalizing, in order to provide a clear opposing theory that strengthens both longtermism and neartermism research?
I think of neartermism vs longtermism as a disagreement over (a) the tractability and crowdedness of longtermist interventions, or (b) time preference. (Though I don’t think many EAs endorse nonzero pure time preference.)