I haven’t read most of GPI’s stuff on defining longtermism, but here are my thoughts. I think (2) is close to what I’d want for a definition of very strong longtermism—“the view on which long-run outcomes are of overwhelming importance”.
I think we should be able to model longtermism using a simpler model than yours. Suppose you’re taking a one-off action $d \in D$, and then you get (discounted) rewards $r_0(d), r_1(d), \dots$ Then I’d say very strong longtermism is true iff the value of each decision depends overwhelmingly on its long-term impact:
$\forall d \in D, \quad \sum_{t=0}^{\infty} r_t(d) \approx \sum_{t=t'}^{\infty} r_t(d),$ where $t'$ is some large number.
You could stipulate that the discounted utility of the distant future has to be within a factor of the total: $(1-\epsilon)\sum_{t=0}^{\infty} r_t(d) < \sum_{t=t'}^{\infty} r_t(d) < (1+\epsilon)\sum_{t=0}^{\infty} r_t(d)$, where $\epsilon \in (0,1)$. If you preferred, you could talk about the differences between utilities for all pairs of decisions, rather than the utility of each individual decision. Or about small deviations from optimal. Or you could consider sequential decision-making, assuming that later decisions are made optimally. Or you could assume a distribution over $D$ (e.g. the distribution of actual human decisions) and talk about the amount of variance in total utility explained by long-term impact. But these are philosophical details—overall, we should land somewhere near your (2).
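As a concrete illustration of the $\epsilon$-version of the condition, here is a minimal sketch (the geometric discounting and the particular reward stream are invented assumptions; note also that the definition requires the check to pass for every $d \in D$, whereas this only checks a single reward stream):

```python
import numpy as np

def very_strong_longtermism_holds(rewards, t_prime, eps=0.05):
    """Check whether the discounted reward accruing from t_prime onwards
    is within a factor (1 - eps, 1 + eps) of the total discounted reward."""
    total = rewards.sum()
    tail = rewards[t_prime:].sum()
    return (1 - eps) * total < tail < (1 + eps) * total

# Toy reward stream for a single action: small rewards for the first 100
# periods, much larger rewards afterwards, discounted geometrically.
t = np.arange(10_000)
rewards = 0.999 ** t * np.where(t < 100, 1.0, 50.0)

print(very_strong_longtermism_holds(rewards, t_prime=100))  # True for this stream
```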
It’s not super clear to me that we want to formalise longtermism—“the ethical view that is particularly concerned with ensuring long-run outcomes go well”. If we did, it might say that sometimes $\sum_{t=t'}^{\infty} r_t(d) - \sum_{t=t'}^{\infty} r_t(d')$ is big, or that it can sometimes outweigh other considerations.
Your (1) is interesting, but it doesn’t seem like a definition of longtermism. I’d call it something like “safety investment is optimal”, because it pertains to practical concerns about how to attain long-term utility.
Rather, I think it’d be more interesting to try to prove that it follows from longtermism, given certain model assumptions (such as yours). To see what I have in mind, we could elaborate my setup.

Setup: let the decision space be $d \in [0,1]$, where $d$ represents the fraction of resources you invest in the long term. Each $r_t$ with $t \ge t'$ is an increasing function of $d$, and each $r_t$ with $t < t'$ is a decreasing function of $d$.

Conjecture: if very strong longtermism is true (for some $t'$ and $\epsilon$), then the optimal action is $d = 1$ (or $d > f(\epsilon)$, for some function $f$ of $\epsilon$).

Proof sketch: since only long-term impact matters, the action with the best long-term impact, $d = 1$, is best overall.
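Here is a minimal numerical sketch of that setup (the functional forms for $r_t(d)$, the horizon, and the discount factor are assumptions chosen only to make the example concrete):

```python
import numpy as np

T_PRIME = 100      # boundary between "short term" and "long term"
HORIZON = 5_000    # truncation of the infinite sum, for the example
DISCOUNT = 0.999

def rewards(d):
    """Reward stream when a fraction d of resources goes to the long term:
    short-term rewards decrease in d, long-term rewards increase in d and
    dominate the total for every d, so very strong longtermism holds here."""
    t = np.arange(HORIZON)
    per_period = np.where(t < T_PRIME, 1.0 - d, 100.0 * (1.0 + d))
    return DISCOUNT ** t * per_period

def total_utility(d):
    return rewards(d).sum()

ds = np.linspace(0.0, 1.0, 101)

# The epsilon condition holds on this grid (rewards are non-negative, so the
# tail can never exceed the total; we only need to check the lower bound).
assert all(rewards(d)[T_PRIME:].sum() > 0.95 * total_utility(d) for d in ds)

# And total utility is maximised by investing everything in the long term.
print(ds[np.argmax([total_utility(d) for d in ds])])  # 1.0
```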
Perhaps a weaker version could be proved in an economic model.
I like the simpler/more general model, although I think you should also take expectations, and allow for multiple joint probability distributions over the outcomes of a single action, to reflect our deep uncertainty (there’s also moral uncertainty, but I would deal with that separately, on top of this). That almost all of the (change in) value happens in the long-term future isn’t helpful to know if we can’t predict which direction it goes.
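To see why the direction matters, here is a rough sketch (the two candidate distributions and the payoffs are invented for illustration): under deep uncertainty we might entertain several probability distributions over the long-term outcomes of the same action, and its expected value can flip sign between them even though, under both, almost all of the value at stake is long-term.

```python
# Long-term payoffs of a single action, and two candidate distributions we
# cannot decide between; nearly all of the value at stake is long-term in both.
outcomes = {"long_term_flourishing": 1e12, "long_term_collapse": -1e12}

candidate_distributions = {
    "optimistic":  {"long_term_flourishing": 0.6, "long_term_collapse": 0.4},
    "pessimistic": {"long_term_flourishing": 0.4, "long_term_collapse": 0.6},
}

def expected_value(dist):
    return sum(p * outcomes[o] for o, p in dist.items())

for name, dist in candidate_distributions.items():
    print(name, expected_value(dist))  # +2e11 vs. -2e11: the sign flips
```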
Greaves and MacAskill define strong longtermism this way:

Let strong longtermism be the thesis that in a wide class of decision situations, the option that is ex ante best is contained in a fairly small subset of options whose ex ante effects on the very long-run future are best.
So this doesn’t say that most of the value of any given action is in the tail of the rewards; perhaps you can find some actions with negligible ex ante long-term consequences, e.g. examples of simple cluelessness.
Sure, that definition is interesting—it seems optimised for advancing arguments about how to do practical ethical reasoning. I think a variation of it would follow from mine: an ex ante very good decision is contained in a set of options whose ex ante effects on the very long-run future are very good.
Still, it would be good to have a definition that generalises to suboptimal agents. Suppose that what’s long-term optimal for me is to work twelve hours a day, but it’s vanishingly unlikely that I’ll do that. Then what can longtermism do for an agent like me? It’d also make sense for us to be able to use longtermism to evaluate the actions of politicians, even if we don’t think any of the actions are long- or short-term optimal.
You could just restrict the set of options, or make the option the intention to follow through with the action, which may fail (and backfire, e.g. through burnout), and adjust your expectations keeping that possibility of failure in mind.
Or attach to each action some probability of actually doing it, and hold that for any positive-EV shorttermist option, there’s a much higher-EV longtermist option which isn’t much less likely to be chosen (it could be the same one for each, but need not be).
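A small sketch of that idea (the options, values, and follow-through probabilities are all hypothetical): weight each option’s expected value by the probability that the agent actually carries it out, and compare shorttermist and longtermist options on that adjusted basis.

```python
# Each option: (expected value if carried out, probability of follow-through).
options = {
    "shorttermist_project":         (10.0, 0.9),
    "work_twelve_hour_days":        (1e6, 0.01),  # long-term optimal but unrealistic
    "sustainable_longtermist_plan": (1e5, 0.6),
}

def choice_adjusted_value(ev, p_follow_through):
    # Crude adjustment: value conditional on follow-through, times its probability.
    return ev * p_follow_through

for name, (ev, p) in options.items():
    print(name, choice_adjusted_value(ev, p))
```

On these made-up numbers, the sustainable longtermist option dominates both the shorttermist one and the long-term-optimal-but-unrealistic one, which is roughly the comparison the proposal asks for.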
Suppose you’re taking a one-off action $d \in D$, and then you get (discounted) rewards $r_0(d), r_1(d), \dots$
I’m a bit confused by this setup. Do you mean that $d$ is analogous to $L_0$, the allocation for $t=0$? If so, what are you assuming about $L_t$ for $t>0$? In my setup, I can compare $U(\bar{L}_0, L^*_1, L^*_2, \dots)$ to $U(L^*_0, L^*_1, L^*_2, \dots)$, so we’re comparing against the optimal allocation, holding $L^*_t$ fixed for $t>0$.
$\forall d \in D, \quad \sum_{t=0}^{\infty} r_t(d) \approx \sum_{t=t'}^{\infty} r_t(d),$ where $t'$ is some large number.
I’m not sure this works. Consider: this condition would also be satisfied in a world with no x-risk, where each generation becomes successively richer and happier, and there’s no need for present generations to care about improving the future. (Or are you defining $r_t(d)$ as the marginal utility of $d$ on generation $t$, as opposed to the utility level of generation $t$ under $d$?)
$d$ is a one-off action taken at $t=0$ whose effects accrue over time, analogous to $L$. (I could be wrong, but I’m proposing that the “long-term” in longtermism refers to utility obtained at different times, not actions taken at different times, so removing the latter helps bring the definition of longtermism into focus.)
This condition would also be satisfied in a world with no x-risk, where each generation becomes successively richer and happier, and there’s no need for present generations to care about improving the future.
Is what you’re saying that actions could vary in their short-term goodness and long-term goodness, such that short- and long-term goodness are perfectly correlated? To me, this is a world where longtermism is true—we can tell an action’s value from its long-term value—and also a world where shorttermism is true. Generations only need to care about the future if longtermism works but other heuristics fail. To your question: $r_t(d)$ is just the utility at time $t$ under $d$.
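A toy example of that kind of world (numbers invented): if every action’s short-term and long-term value are perfectly correlated, and the long-term part dominates, then ranking actions by short-term value, by long-term value, or by total value gives the same answer, so both heuristics “work”.

```python
# Each action: (short-term value, long-term value); perfectly correlated,
# with the long-term part dominating the total.
actions = {"a": (1.0, 1_000.0), "b": (2.0, 2_000.0), "c": (3.0, 3_000.0)}

rank_by_short = sorted(actions, key=lambda a: actions[a][0])
rank_by_long  = sorted(actions, key=lambda a: actions[a][1])
rank_by_total = sorted(actions, key=lambda a: sum(actions[a]))

print(rank_by_short == rank_by_long == rank_by_total)  # True
```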
In my setup, I could say $\int_{0}^{\infty} M_t N_t u(c_t) e^{-\rho t}\,dt \approx \int_{T}^{\infty} M_t N_t u(c_t) e^{-\rho t}\,dt$ for some large $T$; i.e., generations $0$ to $T-1$ contribute basically nothing to total social utility. But I don’t think this captures longtermism, because it is consistent with the social planner allocating no resources to safety work (and all resources to consumption of the current generation); the condition puts no constraints on $L^*$. In other words, this condition only matches the first of the three criteria that Will lists:
(i) Those who live at future times matter just as much, morally, as those who live today;
(ii) Society currently privileges those who live today above those who will live in the future; and
(iii) We should take action to rectify that, and help ensure the long-run future goes well.
Interesting—defining longtermism as rectifying future disprivilege. This is different from what I was trying to model. Honestly, it seems different from all the other definitions. Is this the sort of longtermism that you want to model?
If I were trying to model this, I would want to make reference to a baseline level of disparity, given inaction, and then consider how a (possibly causal) intervention could reduce that disparity.
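One rough way that could look (the structure and numbers below are assumed, just to indicate the shape of such a model): specify the welfare of present and future people under a no-intervention baseline, then measure how much of that baseline disparity a given intervention closes.

```python
# Hypothetical welfare levels for present vs. future people.
baseline          = {"present": 100.0, "future": 40.0}  # disparity under inaction
with_intervention = {"present": 95.0,  "future": 80.0}

def disparity(welfare):
    return welfare["present"] - welfare["future"]

# How much of the baseline disparity the intervention rectifies.
print(disparity(baseline) - disparity(with_intervention))  # 45.0
```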
Do you think Will’s three criteria are inconsistent with the informal definition I used in the OP (“what most matters about our actions is their very long term effects”)?
Not inconsistent, but I think Will’s criteria are just one of many possible reasons that this might be the case.
On Will’s definition, longtermism and shorttermism are mutually exclusive.