Technical comments on type-2 arguments (i.e. those that aim to show that there is no way, or no non-arbitrary way, for us to identify a particular probability measure). [Refer to the parent comment for the distinction between type 1 and type 2 arguments.]
I think this is closer to the argument Vaden was aiming to make despite the somewhat nonstandard use of “measurable” (cf. my comment on type 1 arguments for what measurable vs. immeasurable usually refers to in maths), largely because of this part (emphasis mine) [ETA: Vaden also confirms this in this comment, which I hadn’t seen before writing my comments]:
But don’t we apply probabilities to infinite sets all the time? Yes—to measurable sets. A measure provides a unique method of relating proportions of infinite sets to parts of itself, and this non-arbitrariness is what gives meaning to the notion of probability. While the interval between 0 and 1 has infinitely many real numbers, we know how these relate to each other, and to the real numbers between 1 and 2.
Some comments:
Yes, we need to be more careful when reasoning about infinite sets since some of our intuitions only apply to finite sets. Vaden’s ball-reshuffling example and the “Hilbert’s hotel” thought experiment they mention are two good examples of this.
However, the ball example only shows that one way of specifying a measure no longer works for infinite sample spaces: we can no longer get a measure by counting how many outcomes a subset (think “event”) contains and dividing this by the total number of possible outcomes, because doing so might amount to dividing infinity by infinity.
(We can still get a measure by simply setting the measure of any infinite subset to infinity, which is permitted for general measures, and treating something finite divided by infinity as 0. However, that way the full infinite sample space has measure infinity rather than 1, and thus we can’t interpret this measure as a probability.)
But this need not be problematic. There are a lot of other ways for specifying measures, for both finite and infinite sets. In particular, we don’t have to rely on some ‘mathematical structure’ on the set we’re considering (as in the examples of real numbers that Vaden is giving) or other a priori considerations; when using probabilities for practical purposes, our reasons for using a particular measure will often be tied to empirical information.
For example, suppose I have a coin in my pocket, and I have empirical reasons (perhaps based on past observations, or perhaps I’ve seen how the coin was made) to think that a flip of that coin results in heads with probability 60% and tails with probability 40%. When reasoning about this formally, I might write down {H, T} as sample space, the set of all subsets as σ-algebra, and the unique measure μ with μ({H})=0.6.
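As a minimal sketch of that formal setup (the 0.6/0.4 numbers are the ones assumed above, not facts about any real coin), the finite measure space can be written out explicitly:

```python
from itertools import chain, combinations

# Sample space for a single coin flip
omega = {"H", "T"}

# Sigma-algebra: the power set of omega (all 4 subsets)
def power_set(s):
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

sigma_algebra = power_set(omega)

# Probability measure determined by mu({H}) = 0.6
point_mass = {"H": 0.6, "T": 0.4}

def mu(event):
    """Measure of an event: sum of the point masses of its outcomes."""
    return sum(point_mass[x] for x in event)

# mu(whole space) = 1, so mu is a probability measure
```

Nothing here is mathematically special about {H, T}; the point is just that the empirical context picks out this particular mu among all measures one could define on the same space.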
But this is not because there is any general sense in which the set {H,T} is more “measurable” than the set of all sequences of black or white balls. Without additional (e.g. empirical) context, there is no non-arbitrary way to specify a measure on either set. And with suitable context, there will often be a ‘natural’ or ‘unique’ measure for either because the arbitrariness is defeated by the context.
This works just as well when I have no “objective” empirical data. I might simply have a gut feeling that the probability of heads is 60%, and be willing to e.g. accept bets corresponding to that belief. Someone might think that that’s foolish if I don’t have any objective data and thus bet against me. But it would be a pretty strange objection to say that me giving a probability of 60% is meaningless, or that I’m somehow not able or not allowed to enter such bets.
This works just as well for infinite sample spaces. For example, I might have a single radioactive atom in front of me, and ask myself when it will decay. For instance, I might want to know the probability that this atom will decay within the next 10 minutes. I won’t be deterred by the observation that I can’t get this probability by counting the number of “points in time” in the next 10 minutes and divide them by the total number of points in time. (Nor should I use ‘length’ as derived from the structure of the real numbers, and divide 10 by infinity to conclude that the probability is zero.) I will use an exponential distribution—a probability distribution on the real numbers which, in this context, is non-arbitrary: I have good reasons to use it and not some other distribution.
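To make the exponential-distribution example concrete (the 30-minute half-life below is a purely illustrative, made-up number): for a decay rate λ, the probability of decay within the next t minutes is 1 − e^(−λt).

```python
import math

# Hypothetical half-life of 30 minutes (an assumed number, for illustration only)
half_life = 30.0  # minutes
lam = math.log(2) / half_life  # rate parameter of the exponential distribution

def p_decay_within(t):
    """P(decay within the next t minutes) under an Exponential(lam) model."""
    return 1.0 - math.exp(-lam * t)

p10 = p_decay_within(10.0)  # probability of decay within 10 minutes, ~0.21
```

Note that this assigns a sensible non-zero probability to an interval of real numbers without any counting, and without appealing to ‘length divided by infinity’.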
Note that even if we could get the probability by counting, it would be the wrong one, because the probability that the atom decays isn’t uniform over time. Similarly, if I have reason to think that my coin is biased, I shouldn’t calculate probabilities by naive counting on the set {H,T}. Overall, I struggle to see how the availability of a counting measure matters for the question of whether we can identify a “natural” or “unique” measure.
More generally, we manage to identify particular probability measures to use on both finite and infinite sample spaces all the time, basically any time we use statistics for real-world applications. And this is not because we’re dealing with particularly “measurable” or otherwise mathematically special sample spaces, and despite the fact that there are lots of possible probability measures that we could use.
Again, I do think there may be interesting questions here: How do we manage to do this? But again, I think these are questions for psychology or philosophy that don’t have to do with the cardinality or measurability of sets.
Similarly, I think that looking at statistical practice suggests that your challenge of “can you write down the measure space?” is a distraction rather than pointing to a substantial problem. In practice we often treat particular probability distributions as fundamental (e.g. we’re assuming that something is normally distributed with certain parameters) without “looking under the hood” at the set-theoretic description of random variables. For any given application where we want to use a particular distribution, there are arbitrarily many ways to write down a measure space and a random variable having that distribution; but usually we only care about the distribution and not these more fundamental details, and so aren’t worried by any “non-uniqueness” problem.
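The non-uniqueness point can be illustrated with a small sketch: two different underlying constructions (inverse-CDF sampling from a uniform space, and the Box-Muller transform on pairs of uniforms) yield random variables with the same standard normal distribution. The specific constructions are my choice for illustration, not anything from statistical practice in particular.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(0)
n = 100_000
std_normal = NormalDist(0.0, 1.0)

# Construction 1: sample space [0, 1) with uniform measure,
# random variable = inverse CDF of the standard normal.
xs1 = [std_normal.inv_cdf(random.random()) for _ in range(n)]

# Construction 2: a different underlying space (pairs of uniforms)
# pushed through the Box-Muller transform.
def box_muller():
    u1, u2 = random.random(), random.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

xs2 = [box_muller() for _ in range(n)]

# Both samples follow the same (standard normal) distribution,
# even though the measure spaces underneath differ.
```

For any application that only cares about the distribution, the choice between such constructions is invisible, which is the sense in which the “write down the measure space” challenge doesn’t bite.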
The most viable anti-longtermist argument I could see in the vicinity would be roughly as follows:
Argue that there is some relevant contingent (rather than e.g. mathematical) difference between longtermist and garden-variety cases.
Probably one would try to appeal to something like the longtermist cases being more “complex” relative to our reasoning and computational capabilities.
One could also try an “argument from disagreement”: perhaps our use of probabilities when e.g. forecasting the number of guests to my Christmas party is justified simply by the fact that ~everyone agrees how to do this. By contrast, in longtermist cases, maybe we can’t get such agreement.
Argue that this difference makes a difference for whether we’re justified in using subjective probabilities or expected values, or whatever the target of the criticism is supposed to be.
But crucially, I think mathematical features of the objects we’re dealing with when talking about common practices in a formal language are not where we can hope to find support for such an argument. This is because the longtermist and garden-variety cases don’t actually differ relevantly regarding these features.
Instead, I think the part we’d need to understand is not why there might be a challenge, but how and why in garden-variety cases we’re able to overcome that challenge. Only then can we assess whether these—or other—“methods” are also available to the longtermist.
Hi Max! Again, I agree the longtermist and garden-variety cases may not actually differ regarding the measure-theoretic features in Vaden’s post, but some additional comments here.
But it would be a pretty strange objection to say that me giving a probability of 60% is meaningless, or that I’m somehow not able or not allowed to enter such bets.
Although “probability of 60%” may be less meaningful than we’d like or expect, you are certainly allowed to enter such bets. In fact, that someone is willing to take the other side suggests that they disagree. This highlights the difficulty of converging on objective probabilities for future outcomes which aren’t directly governed by domain-specific science (e.g. the laws of planetary motion). For outcomes closer in time, we might converge reasonably closely on an unambiguous measure, or an appropriate parametric statistical model.
Regarding the “60% probability” for future outcomes, a useful thought experiment for me was how I might reason about the risk profile of bets made on open-ended future outcomes. I quickly become less convinced that I’m estimating meaningful risk the further out I go. Further, we only run the future once, so it’s hard to actually confirm that our probability is meaningful (as we can for repeated coin flips). We could make longtermist bets by transferring money between our far-future offspring, but we can’t tell who comes out on top “in expectation” beyond simple arbitrages.
This defence is that for any instance of probabilistic reasoning about the future we can simply ignore most possible futures
Honest question being new to EA… is it not problematic to restrict our attention to possible futures or aspects of futures which are relevant to a single issue at a time? Shouldn’t we calculate Expected Utility over billion year futures for all current interventions, and set our relative propensity for actions = exp{α * EU } / normalizer ?
For example, the downstream effects of donating to Anti-Malaria would be difficult to reason about, but we are clueless as to whether its EU would be dwarfed by AI safety’s on the billion-year timescale (e.g. via bringing the entire world out of poverty, thereby limiting the political risk of ending up with a totalitarian government).
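The softmax rule suggested in the question above can be sketched as follows. The EU numbers and intervention names are placeholders, not real estimates:

```python
import math

# Hypothetical expected utilities for interventions (placeholder numbers)
expected_utility = {"anti_malaria": 1.0, "ai_safety": 1.3, "do_nothing": 0.0}

# Temperature-like parameter: higher alpha concentrates propensity
# on the highest-EU action; alpha -> 0 gives a uniform policy.
alpha = 2.0

exps = {a: math.exp(alpha * eu) for a, eu in expected_utility.items()}
normalizer = sum(exps.values())
propensity = {a: v / normalizer for a, v in exps.items()}
# propensity sums to 1 and is ordered by EU
```

Of course, the hard part the question points at is computing the EU inputs over billion-year futures, not the normalization step.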
Honest question being new to EA… is it not problematic to restrict our attention to possible futures or aspects of futures which are relevant to a single issue at a time? Shouldn’t we calculate Expected Utility over billion year futures for all current interventions, and set our relative propensity for actions = exp{α * EU } / normalizer ?
Yes, I agree that it’s problematic. We “should” do the full calculation if we could, but in fact we can’t because of our limited capacity for computation/thinking.
But note that in principle this situation is familiar. E.g. a CEO might try to maximize the long-run profits of her company, or a member of government might try to design a healthcare policy that maximizes wellbeing. In none of these cases are we able to do the “full calculation”, albeit by a less dramatic margin than for longtermism.
And we don’t think that the CEO’s or the politician’s efforts are meaningless or doomed or anything like that. We know that they’ll use heuristics, simplified models, or other computational shortcuts; we might disagree with them about which heuristics and models to use, and if repeatedly queried with “why?” both they and we would come to a place where we’d struggle to justify some judgment call or choice of prior or whatever. But that’s life—a familiar situation and one we can’t get out of.