The “immeasurability” of the future that Vaden has highlighted has nothing to do with the literal finiteness of the timeline of the universe. It has to do, rather, with the set of all possible futures (which is provably infinite). This set is immeasurable in the mathematical sense of lacking sufficient structure to be operated upon with a well-defined probability measure. Let me turn the question around on you: suppose we knew that the time-horizon of the universe was finite, could you write out the sample space, $\sigma$-algebra, and measure that allow us to compute over possible futures?
I can see two possible types of arguments here, which are importantly different.
Arguments aiming to show that there can be no probability measure—or at least no “non-trivial” one—on some relevant set such as the set of all possible futures.
Arguments aiming to show that, among the many probability measures that can be defined on some relevant set, there is no way, or no non-arbitrary way, to identify a particular one.
[ETA: In this comment, which I hadn’t seen before writing mine, Vaden seems to confirm that they were trying to make an argument of the second rather than the first kind.]
In this comment I’ll explain why I think both types of arguments would prove too much and thus are non-starters. In other comments I’ll make some more technical points about type 1 and type 2 arguments, respectively.
(I split my points between comments so the discussion can be organized better and people can use up-/downvotes in a more fine-grained way.)
I’m doing this largely because I’m worried that to some readers the technical language in Vaden’s post and your comment will suggest that longtermism specifically faces some deep challenges that are rooted in advanced mathematics. But in fact I think that characterization would be seriously mistaken (at least regarding the issues you point to). Instead, I think that the challenges either have little to do with the technical results you mention or that the challenges are technical but not specific to longtermism.
[After writing I realized that the below has a lot of overlap with what Owen and Elliot have written earlier. I’m still posting it because there are slight differences and there is no harm in doing so, but people who read the previous discussions may not want to read this.]
Both types of arguments prove too much because they (at least based on the justifications you’ve given in the post and discussion here) are not specific to longtermism at all. They would e.g. imply that I can’t have a probability distribution over how many guests will come to my Christmas party tomorrow, which is absurd.
To see this, note that everything you say would apply in a world that ends in two weeks, or to deliberations that ignore any effects after that time. In particular, it is still true that the set of these possible ‘short futures’ is infinite (my housemate could enter the room any minute and shout any natural number), and that the set of possible futures contains things that, like your example of a sequence of black and white balls, have no unique ‘natural’ structure or measure (e.g. the collection of atoms in a certain part of my table, or the types of possible items on that table).
So these arguments seem to show that we can never meaningfully talk about the probability of any future event, whether it happens in a minute or in a trillion years. Clearly, this is absurd.
Now, there is a defence against this argument, but I think this defence is just as available to the longtermist as it is to (e.g.) me when thinking about the number of guests at my Christmas party next week.
This defence is that for any instance of probabilistic reasoning about the future we can simply ignore most possible futures, and in fact only need to reason over specific properties of the future. For instance, when thinking about the number of guests to my Christmas party, I can ignore people shouting natural numbers or the collection of objects on my table—I don’t need to reason about anything close to a complete or “low-level” (e.g. in terms of physics) description of the future. All I care about is a single natural number—the number of guests—and each number corresponds to a huge set of futures at the level of physics.
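As a toy sketch of this coarse-graining point, here is a minimal simulation; the invitees, their attendance probabilities, and the sample size are all invented for illustration:

```python
from collections import Counter
import random

random.seed(0)

# Toy "low-level" futures: one boolean per invitee saying whether they attend.
# The attendance probabilities are made-up numbers, purely for illustration.
attend_prob = [0.9, 0.5, 0.8, 0.3, 0.7]

def sample_low_level_future():
    return tuple(random.random() < p for p in attend_prob)

# High-level property: the number of guests. Many distinct low-level futures
# map to the same guest count, so we only need a distribution over the count.
counts = Counter(sum(future) for future in
                 (sample_low_level_future() for _ in range(100_000)))
for guests in sorted(counts):
    print(guests, counts[guests] / 100_000)
```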
But this works for many if not all longtermist cases as well! The number of people in one trillion years is a natural number, as is the year in which transformative AI is developed, etc. Whether or not identifying the relevant properties, or the probability measure we’re adopting, is harder than for typical short-term cases—and maybe prohibitively hard—is an interesting and important question. But it’s an empirical question, not one we should expect to answer by appealing to mathematical considerations around the cardinality or measurability of certain sets.
Separately, there may be an interesting question about how I’m able to identify the high-level properties I’m reasoning about—whether that high-level property is the number of people coming to my party or the number of people living in a trillion years. How do I know I “should pay attention” only to the number of party guests and not which natural numbers they may be shouting? And how am I able to “bridge” between more low-level descriptions of futures (e.g. a list of specific people coming to the party, or a video of the party, or even a set of initial conditions plus laws of motion for all relevant elementary particles)? There may be interesting questions here, but I think these are questions for philosophy or psychology that, in my view, aren’t particularly illuminated by referring to concepts from measure theory. (And again, they aren’t specific to longtermism.)
Technical comments on type-1 arguments (those aiming to show there can be no probability measure). [Refer to the parent comment for the distinction between type 1 and type 2 arguments.]
I basically don’t see how such an argument could work. Apologies if that’s totally clear to you and you were just trying to make a type-2 argument. However, I worry that some readers might come away with the impression that there is a viable argument of type 1 since Vaden and you mention issues of measurability and infinite cardinality. These relate to actual mathematical results showing that for certain sets, measures with certain properties can’t exist at all.
However, I don’t think this is relevant to the case you describe. And I also don’t think it can be salvaged for an argument against longtermism.
First, in what sense can sets be “immeasurable”? The issue can arise in the following situation. Suppose we have some set (in this context the “sample space”—think of its elements as all possible instances of things that can happen, at the most fine-grained level), and some measure (in this context “probability”—but it could also refer to something we’d intuitively call length or volume) we would like to assign to some subsets (the subsets in this context are “events”—e.g. the event that Santa Claus enters my room now is represented by the subset containing all instances with that property).
In this situation, it can happen that there is no way to extend this measure to all subsets.
The classic example here is the real line as base set. We would like a measure that assigns measure |a−b| to each interval [a,b] (the set of real numbers from a to b), thus corresponding to our intuitive notion of length. E.g. the interval [−1,3] should have length 4.
However, it turns out that there is no measure that assigns each interval its length and ‘works’ for all subsets of the real numbers. I.e. each way of extending the assignment to all subsets of the real line would violate one of the properties we want measures to have (e.g. the measure of an at most countable disjoint union of sets should be the sum of the measures of the individual sets).
Thus we have to limit ourselves to assigning a measure to only some subsets. (In technical terms: we have to use a σ-algebra that’s strictly smaller than the full set of all subsets.) In other words, there are some subsets the measure of which we have to leave undefined. Those are immeasurable sets.
Second, why don’t I think this will be a problem in this context?
At the highest level, note that even if we are in a context with immeasurable sets this does not mean that we get no (probability) measure at all. It just means that the measure won’t “work” for all subsets/events. So for this to be an objection to longtermism, we would need a further argument for why specific events we care about are immeasurable—or in other words, why we can’t simply limit ourselves to the set of measurable events.
Note that immeasurable sets, to the extent that we can describe them concretely at all, are usually highly ‘weird’. If you try to google for pictures of standard examples like Vitali sets you won’t find a single one because we essentially can’t visualize them. Indeed, by design every set that we can construct from intervals by countably many standard operations like intersections and unions is measurable. So at least in the case of the real numbers, we arguably won’t encounter immeasurable sets “in practice”.
Note also that the phenomenon of immeasurable sets enables a number of counterintuitive results, such as the Banach-Tarski theorem. Loosely speaking, this theorem says we can cut up a ball into finitely many pieces, and then by moving around those pieces and reassembling them get a ball that has twice the volume of the original ball; so for example “a pea can be chopped up and reassembled into the Sun”.
But usually the conclusion we draw from this is not that it’s meaningless to use numbers to refer to the coordinates of objects in space, or that our notion of volume is meaningless and that “we cannot measure the volume of objects” (and to the extent there is a problem it doesn’t exclusively apply to particularly large objects—just as any problem relevant to predicting the future wouldn’t specifically apply to longtermism). At most, we might wonder whether our model of space as continuous in real-number coordinates “breaks down” in certain edge cases, but we don’t think that this invalidates pragmatic uses of this model that never use its full power (in terms of logical implications).
Immeasurable subsets are a phenomenon intimately tied to uncountable sets—i.e. ones that are even “larger” than the natural numbers (for instance, the real numbers are uncountable, but the rational numbers are not). This is roughly because the relevant concepts like σ-algebras and measures are defined in terms of countably many operations like unions or sums; and if you “fix” the measure of some sets in a way that’s consistent at all, then you can uniquely extend this to all sets you can get from those by taking complements and countable intersections and unions. In particular, if in a countable set you fix the measure of all singleton sets containing just one element, then this defines a unique measure on the set of all subsets.
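A minimal sketch of this extension-from-singletons point, using a geometric distribution on the natural numbers (the choice of distribution and the parameter p are mine, purely for illustration):

```python
# On a countable sample space, fixing the masses of singletons determines the
# measure of every subset by (countable) additivity. Here: a geometric
# distribution on {0, 1, 2, ...} with an arbitrary parameter p.
p = 0.5

def singleton_mass(n):
    return p * (1 - p) ** n  # P({n})

def mu(event, cutoff=100_000):
    # Measure of an event (a set of naturals), approximating the infinite
    # sum by truncation at `cutoff`.
    return sum(singleton_mass(n) for n in event if n < cutoff)

evens = range(0, 100_000, 2)
print(mu(evens))  # exact value is p / (1 - (1 - p)**2) = 2/3 for p = 0.5
```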
Your examples of possible futures where people shout different natural numbers involve only countable sets. So it’s hard to see how we’d get any problem with immeasurable sets there.
You might be tempted to modify the example to argue that the set of possible futures is uncountably infinite because it contains people shouting all real numbers. However, (i) it’s not clear if it’s possible for people to shout any real number, (ii) even if it is then all my other remarks still apply, so I think this wouldn’t be a problem, certainly none specific to longtermism.
Regarding (i), the problem is that there is no general way to refer to an arbitrary real number within a finite window of time. In particular, I cannot “shout” an infinite and non-periodic decimal expansion; nor can I “shout” a sequence of rational numbers that converges to the real number I want to refer to (except maybe in a few cases where the sequence is a closed-form function of n).
More generally, if utterances are individuated by the finite sequence of words I’m using, then (assuming a finite alphabet) there are only countably many possible utterances I can make. If that’s right then I cannot refer to an arbitrary real number precisely because there are “too many” of them.
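A small sketch of why this is: finite strings over a finite alphabet can be listed in a single enumeration (first by length, then alphabetically), which is exactly what countability means. The two-letter alphabet is just an illustrative stand-in:

```python
from itertools import count, product

alphabet = "ab"  # stand-in for whatever finite alphabet utterances use

def all_utterances():
    # Enumerate every finite string: length 0, then length 1, then length 2, ...
    # This lists all utterances in one infinite sequence, showing the set
    # of possible utterances is countable.
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)

gen = all_utterances()
print([next(gen) for _ in range(7)])  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```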
Similarly, the set of all sequences of black or white balls is uncountable, but it’s unclear whether we should think that it’s contained in the set of all possible futures.
More importantly: if there were serious problems due to immeasurable sets—whether with longtermism or elsewhere—we could retreat to reasoning about a countable subset. For instance, if I’m worried that predicting the development of transformative AI is problematic because “time from now” is measured in real numbers, I could simply limit myself to only reasoning about rational numbers of (e.g.) seconds from now.
There may be legitimate arguments for this response being ‘ad hoc’ or otherwise problematic. (E.g. perhaps I would want to use properties of rational numbers that can only be proven by using real numbers “within the proof”.) But especially given the large practical utility of reasoning about e.g. volumes of space or probabilities of future events, I think it at least shows that immeasurability can’t ground a decisive knock-down argument.
As even more of an aside, type 1 arguments would also be vulnerable to a variant of Owen’s objection that they “prove too little”.
However, rather than the argument depending too much on contingent properties of the world (e.g. whether it’s spatially infinite), the issue here is that it would depend on the axiomatization of mathematics.
The situation is roughly as follows: There are two different axiomatizations of mathematics with the following properties:
In both of them all maths that any of us are likely to ever “use in practice” works basically the same way.
For parallel situations (i.e. assignments of measure to some subsets of some set, which we’d like to extend to a measure on all subsets) there are immeasurable subsets in exactly one of the axiomatizations.
Specifically, for example, for our intuitive notion of “length” there are immeasurable subsets of the real numbers in the standard axiomatization of mathematics (called ZFC here). However, if we omit a single axiom—the axiom of choice—and replace it with an axiom that loosely says that there are weirdly large sets then every subset of the real numbers is measurable. [ETA: Actually it’s a bit more complicated, but I don’t think in a way that matters here. It doesn’t follow directly from these other axioms that everything is measurable, but using these axioms it’s possible to construct a “model of mathematics” in which that holds. Even less importantly, we don’t totally omit the axiom of choice but replace it with a weaker version.]
I think it would be pretty strange if the viability of longtermism depended on such considerations. E.g. imagine writing a letter to people in 1 million years explaining why you didn’t choose to try to help more rather than fewer of them. Or imagine getting such a letter from the distant past. I think I’d be pretty annoyed if I read “we considered helping you, but then we couldn’t decide between the axiom of choice and inaccessible cardinals …”.
Technical comments on type-2 arguments (i.e. those that aim to show there is no way, or no non-arbitrary way, for us to identify a particular probability measure). [Refer to the parent comment for the distinction between type 1 and type 2 arguments.]
I think this is closer to the argument Vaden was aiming to make despite the somewhat nonstandard use of “measurable” (cf. my comment on type 1 arguments for what measurable vs. immeasurable usually refers to in maths), largely because of this part (emphasis mine) [ETA: Vaden also confirms this in this comment, which I hadn’t seen before writing my comments]:
But don’t we apply probabilities to infinite sets all the time? Yes—to measurable sets. A measure provides a unique method of relating proportions of infinite sets to parts of itself, and this non-arbitrariness is what gives meaning to the notion of probability. While the interval between 0 and 1 has infinitely many real numbers, we know how these relate to each other, and to the real numbers between 1 and 2.
Some comments:
Yes, we need to be more careful when reasoning about infinite sets since some of our intuitions only apply to finite sets. Vaden’s ball reshuffling example and the “Hilbert’s hotel” thought experiment they mention are two good examples for this.
However, the ball example only shows that one way of specifying a measure no longer works for infinite sample spaces: we can no longer get a measure by counting how many instances a subset (think “event”) consists of and dividing this by the number of all possible samples, because doing so might amount to dividing infinity by infinity.
(We can still get a measure by simply setting the measure of any infinite subset to infinity, which is permitted for general measures, and treating something finite divided by infinity as 0. However, that way the full infinite sample space has measure infinity rather than 1, and thus we can’t interpret this measure as probability.)
But this need not be problematic. There are a lot of other ways for specifying measures, for both finite and infinite sets. In particular, we don’t have to rely on some ‘mathematical structure’ on the set we’re considering (as in the examples of real numbers that Vaden is giving) or other a priori considerations; when using probabilities for practical purposes, our reasons for using a particular measure will often be tied to empirical information.
For example, suppose I have a coin in my pocket, and I have empirical reasons (perhaps based on past observations, or perhaps I’ve seen how the coin was made) to think that a flip of that coin results in heads with probability 60% and tails with probability 40%. When reasoning about this formally, I might write down {H, T} as sample space, the set of all subsets as σ-algebra, and the unique measure μ with μ({H})=0.6.
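Written out explicitly, this measure space is small enough to verify by hand. A minimal sketch (the 60/40 numbers are the empirical input from the example; everything else follows from the definitions):

```python
from itertools import chain, combinations

omega = ("H", "T")  # sample space

# Sigma-algebra: for a finite sample space we can use the full power set.
sigma_algebra = [frozenset(s) for s in
                 chain.from_iterable(combinations(omega, r)
                                     for r in range(len(omega) + 1))]

# Fix the measure on singletons (the empirical input), extend by additivity.
point_mass = {"H": 0.6, "T": 0.4}

def mu(event):
    return sum(point_mass[x] for x in event)

assert abs(mu(frozenset(omega)) - 1.0) < 1e-12  # normalization
assert mu(frozenset()) == 0                      # empty event has measure 0
print({tuple(sorted(e)): mu(e) for e in sigma_algebra})
```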
But this is not because there is any general sense in which the set {H,T} is more “measurable” than the set of all sequences of black or white balls. Without additional (e.g. empirical) context, there is no non-arbitrary way to specify a measure on either set. And with suitable context, there will often be a ‘natural’ or ‘unique’ measure for either because the arbitrariness is defeated by the context.
This works just as well when I have no “objective” empirical data. I might simply have a gut feeling that the probability of heads is 60%, and be willing to e.g. accept bets corresponding to that belief. Someone might think that that’s foolish if I don’t have any objective data and thus bet against me. But it would be a pretty strange objection to say that me giving a probability of 60% is meaningless, or that I’m somehow not able or not allowed to enter such bets.
This works just as well for infinite sample spaces. For example, I might have a single radioactive atom in front of me, and ask myself when it will decay. For instance, I might want to know the probability that this atom will decay within the next 10 minutes. I won’t be deterred by the observation that I can’t get this probability by counting the number of “points in time” in the next 10 minutes and dividing it by the total number of points in time. (Nor should I use ‘length’ as derived from the structure of the real numbers, and divide 10 by infinity to conclude that the probability is zero.) I will use an exponential distribution—a probability distribution on the real numbers which, in this context, is non-arbitrary: I have good reasons to use it and not some other distribution.
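For concreteness, a minimal sketch; the 30-minute half-life is a number I made up, and the point is just that the rate parameter comes from empirical information rather than from counting:

```python
import math

half_life_min = 30.0               # hypothetical half-life (empirical input)
lam = math.log(2) / half_life_min  # decay rate per minute

def p_decay_within(t_minutes):
    # Exponential distribution: P(decay within t) = 1 - exp(-lam * t)
    return 1.0 - math.exp(-lam * t_minutes)

print(p_decay_within(10.0))  # ~0.206 for a 30-minute half-life
```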
Note that even if we could get the probability by counting it would be the wrong one because the probability that the atom decays isn’t uniform. Similarly, if I have reasons to think that my coin is biased, I shouldn’t calculate probabilities by naive counting using the set {H,T}. Overall, I struggle to see how the availability of a counting measure is important to the question whether we can identify a “natural” or “unique” measure.
More generally, we manage to identify particular probability measures to use on both finite and infinite sample spaces all the time, basically any time we use statistics for real-world applications. And this is not because we’re dealing with particularly “measurable” or otherwise mathematically special sample spaces, and despite the fact that there are lots of possible probability measures that we could use.
Again, I do think there may be interesting questions here: How do we manage to do this? But again, I think these are questions for psychology or philosophy that don’t have to do with the cardinality or measurability of sets.
Similarly, I think that looking at statistical practice suggests that your challenge of “can you write down the measure space?” is a distraction rather than pointing to a substantial problem. In practice we often treat particular probability distributions as fundamental (e.g. we’re assuming that something is normally distributed with certain parameters) without “looking under the hood” at the set-theoretic description of random variables. For any given application where we want to use a particular distribution, there are arbitrarily many ways to write down a measure space and a random variable having that distribution; but usually we only care about the distribution and not these more fundamental details, and so aren’t worried by any “non-uniqueness” problem.
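This is roughly how that practice looks in code (a sketch using scipy; the normal distribution and its parameters are arbitrary stand-ins, not anything from the discussion above):

```python
from scipy.stats import norm

# We specify the distribution directly and never write down the underlying
# measure space or random variable; the parameters here are arbitrary.
guests = norm(loc=10, scale=2)

print(guests.cdf(12) - guests.cdf(8))      # P(8 <= guests <= 12), ~0.68
print(guests.rvs(size=5, random_state=0))  # sampling, no sigma-algebra in sight
```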
The most viable anti-longtermist argument I could see in the vicinity would be roughly as follows:
Argue that there is some relevant contingent (rather than e.g. mathematical) difference between longtermist and garden-variety cases.
Probably one would try to appeal to something like the longtermist cases being more “complex” relative to our reasoning and computational capabilities.
One could also try an “argument from disagreement”: perhaps our use of probabilities when e.g. forecasting the number of guests to my Christmas party is justified simply by the fact that ~everyone agrees how to do this. By contrast, in longtermist cases, maybe we can’t get such agreement.
Argue that this difference makes a difference for whether we’re justified to use subjective probabilities or expected values, or whatever the target of the criticism is supposed to be.
But crucially, I think mathematical features of the objects we’re dealing with when talking about common practices in a formal language are not where we can hope to find support for such an argument. This is because the longtermist and garden-variety cases don’t actually differ relevantly regarding these features.
Instead, I think the part we’d need to understand is not why there might be a challenge, but how and why in garden-variety cases we’re able to overcome that challenge. Only then can we assess whether these—or other—“methods” are also available to the longtermist.
Hi Max! Again, I agree the longtermist and garden-variety cases may not actually differ regarding the measure-theoretic features in Vaden’s post, but some additional comments here.
But it would be a pretty strange objection to say that me giving a probability of 60% is meaningless, or that I’m somehow not able or not allowed to enter such bets.
Although “probability of 60%” may be less meaningful than we’d like or expect, you are certainly allowed to enter such bets. In fact, someone’s willingness to take the other side suggests that they disagree. This highlights the difficulty of converging on objective probabilities for future outcomes which aren’t directly subject to domain-specific science (e.g. laws of planetary motion). For outcomes closer in time, we might converge reasonably well on an unambiguous measure, or an appropriate parametric statistical model.
Regarding the “60% probability” for future outcomes, a useful thought experiment for me was how I might reason about the risk profile of bets made on open-ended future outcomes. I quickly become less convinced I’m estimating meaningful risk the further out I go. Further, we only run the future once, so it’s hard to actually confirm our probability is meaningful (as we can for repeated coin flips). We could make longtermist bets by transferring money between our far-future offspring, but we can’t tell who comes out on top “in expectation” beyond simple arbitrages.
This defence is that for any instance of probabilistic reasoning about the future we can simply ignore most possible futures
Honest question being new to EA… is it not problematic to restrict our attention to possible futures or aspects of futures which are relevant to a single issue at a time? Shouldn’t we calculate Expected Utility over billion year futures for all current interventions, and set our relative propensity for actions = exp{α * EU } / normalizer ?
For example, the downstream effects of donating to Anti-Malaria would be difficult to reason about, but we are clueless as to whether its EU would be dwarfed by AI safety on the billion-year timescale, e.g. by bringing the entire world out of poverty or by limiting political risks leading to totalitarian government.
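For what it’s worth, a minimal sketch of the propensity rule I have in mind (the EU values and α are placeholders, not estimates of anything):

```python
import numpy as np

eu = np.array([3.0, 1.0, 2.5])  # placeholder expected utilities of 3 interventions
alpha = 1.0                     # placeholder weighting parameter

# Relative propensity for each action: exp(alpha * EU) / normalizer (a softmax).
propensity = np.exp(alpha * eu) / np.exp(alpha * eu).sum()
print(propensity, propensity.sum())  # propensities sum to 1
```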
Honest question being new to EA… is it not problematic to restrict our attention to possible futures or aspects of futures which are relevant to a single issue at a time? Shouldn’t we calculate Expected Utility over billion year futures for all current interventions, and set our relative propensity for actions = exp{α * EU } / normalizer ?
Yes, I agree that it’s problematic. We “should” do the full calculation if we could, but in fact we can’t because of our limited capacity for computation/thinking.
But note that in principle this situation is familiar. E.g. a CEO might try to maximize the long-run profits of her company, or a member of government might try to design a healthcare policy that maximizes wellbeing. In none of these cases are we able to do the “full calculation”, albeit by a less dramatic margin than for longtermism.
And we don’t think that the CEO’s or the politician’s efforts are meaningless or doomed or anything like that. We know that they’ll use heuristics, simplified models, or other computational shortcuts; we might disagree with them about which heuristics and models to use, and if repeatedly queried with “why?” both they and we would come to a place where we’d struggle to justify some judgment call or choice of prior or whatever. But that’s life—a familiar situation and one we can’t get out of.