(and the same probabilities that not doing X leads to the opposite outcomes)
I’m not sure exactly what you mean by this, and I expect this will make it more complicated to think about than just giving utility differences with the counterfactual.
The idea of sensitivity to new information has been called credal resilience/credal fragility, but the problem I’m concerned with is having justified credences. I would often find it deeply unsatisfying (i.e. it seems unjustifiable) to represent my beliefs with a single probability distribution; I’d feel like I’m pulling numbers out of my ass, and I don’t think we should base important decisions on such numbers. So, I’d often rather give ranges for my probabilities. You literally can give single distributions/precise probabilities, but it seems unjustifiable, overconfident and silly.
If you haven’t already, I’d recommend reading the illustrative example here. I’d say it’s not actually justifiable to assign precisely 50-50 in that case or in almost any realistic situation that actually matters, because:
if you actually tried to build a model, it would be extraordinarily unlikely for you to get 50-50 unless you specifically pick your model parameters to get that result (which would be motivated reasoning and kind of defeat the purpose of building the model in the first place) or round the results, given that the evidence isn’t symmetric and you’d have multiple continuous parameters.
if you thought 50-50 was a good estimate before the evidential sweetening, then you can’t use 50-50 after, even though it seems just as appropriate for it. Furthermore, if you would have used 50-50 if originally presented with the sweetened information, then your beliefs depend on the timing/order in which you become aware of evidence (say you just miscounted witnesses the first time), which should be irrelevant and is incompatible with Bayesian rationality (unless you have specific reasons for dependence on the timing/order).
For the same reasons, in almost any realistic situation that actually matters, Alice in your example could not justifiably get 50-50. And in general, you shouldn’t get numbers with short exact decimal or fractional representations.
So, say in your example, it comes out 51.28… to 48.72…, but could have gone the other way under different reasonable parameter assignments; those are just the ones Alice happened to pick at that particular time. Maybe she also tells you it seems pretty arbitrary, and she could imagine having come up with the opposite conclusion and probabilities much further from 50-50 in each direction. And that she doesn’t have a best guess, because, again, it seems too arbitrary.
How would you respond if there isn’t enough time to investigate further, but you could instead support something that seems cost-effective without being so sensitive to pretty arbitrary parameter assignments, though not nearly as cost-effective as Alice’s intervention or an intervention doing the opposite?
Also imagine Bob gets around 47-53, and agrees with Alice about the arbitrariness and reasonable ranges. Furthermore, you can’t weigh Alice and Bob’s distributions evenly, because Alice has slightly more experience as a researcher and/or a slightly better score in forecasting, so you should give her estimate more weight.
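To make the weighting worry concrete, here is a minimal sketch of a linear opinion pool over Alice’s and Bob’s point estimates. The symmetric payoffs (+1 if the intervention helps, -1 if it harms), the helper name `pooled_ev` and the candidate weights are all illustrative assumptions, not part of the original example.

```python
def pooled_ev(p_alice, p_bob, w_alice, gain=1.0, loss=-1.0):
    """Expected value under a linear opinion pool of two probability estimates."""
    p = w_alice * p_alice + (1.0 - w_alice) * p_bob
    return p * gain + (1.0 - p) * loss

# Point estimates from the example; the payoffs and weights are assumptions.
p_alice, p_bob = 0.5128, 0.47

# Alice gets "more weight", but how much more is itself a judgment call.
evs = {w: pooled_ev(p_alice, p_bob, w) for w in (0.5, 0.6, 0.7, 0.8, 0.9)}
for w, ev in evs.items():
    print(f"weight on Alice = {w:.1f}: expected value = {ev:+.4f}")
```

Under these assumed payoffs the sign of the expected value flips at a weight of roughly 0.7 on Alice, so the conclusion depends on exactly how much “slightly more” weight she gets.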
I’m not sure exactly what you mean by this, and I expect this will make it more complicated to think about than just giving utility differences with the counterfactual.
I just added this in hastily to address any objection that says something like “What if I’m risk averse and prefer a 100% chance of getting 0 utility instead of an x% chance of getting very negative utility.” It would probably have been better to just say something like “ignore risk aversion and non-linear utility.”
I would often find it deeply unsatisfying (i.e. it seems unjustifiable) to represent my beliefs with a single probability distribution; I’d feel like I’m pulling numbers out of my ass, and I don’t think we should base important decisions on such numbers. So, I’d often rather give ranges for my probabilities. You literally can give single distributions/precise probabilities, but it seems unjustifiable, overconfident and silly.
I think this boils down to my point about the fear of miscommunicating: questions like “how should I communicate my findings,” “what do my findings say about doing further analysis,” and “what are my findings’ current best-guess estimates.” If you think it goes beyond that, i.e. that it is actually “intrinsically incorrect-as-written,” I could write up a longer reply elaborating on the following: I’d pose the question back at you and ask whether it’s really justified or optimal to include ambiguity-laden “ranges,” assuming there will be no miscommunication risks (e.g., nobody assumes “he said 57.61% so he must be very confident he’s right and doing more analysis won’t be useful”). If you say “there’s a 1%-99% chance that a given coin will land on heads” because the coin is weighted but you don’t know whether it’s for heads or tails, how is this functionally any different from saying “my best guess is that on one flip the coin has a 50% chance of landing on heads”? (Again, I could elaborate further if needed.)
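A small sketch of the coin case, assuming the coin is weighted 99-1 one way or the other and that each direction is equally likely (the equal weighting is my assumption; the comment only says the direction is unknown). The helper `prob_all_heads` is a hypothetical name.

```python
def prob_all_heads(biases, weights, flips=1):
    """Probability of `flips` heads in a row under a mixture over coin biases."""
    return sum(w * b ** flips for b, w in zip(biases, weights))

# Assumption: heads-probability is either 0.99 or 0.01, each with weight 0.5.
biases, weights = (0.99, 0.01), (0.5, 0.5)

one_flip_mixture = prob_all_heads(biases, weights, flips=1)
one_flip_fair = prob_all_heads((0.5,), (1.0,), flips=1)
two_flip_mixture = prob_all_heads(biases, weights, flips=2)
two_flip_fair = prob_all_heads((0.5,), (1.0,), flips=2)

print(one_flip_mixture, one_flip_fair)
print(two_flip_mixture, two_flip_fair)
```

For a single flip the mixture and the plain 50% estimate agree, which is the sense in which they are functionally the same. The two-flip comparison is an added observation rather than something claimed in the comment: the two descriptions come apart once you bet on more than one flip, which is presumably why the question is phrased as “on one flip.”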
if you actually tried to build a model, it would be extraordinarily unlikely for you to get 50-50
Sure, I agree. But that doesn’t change the decision in the example I gave, at least when you leave it at “upon further investigation it’s actually about 51-49.” In either case, the expected benefit-cost ratio is still roughly around 2:1. When facing analytical constraints and for this purely theoretical case, it seems optimal to do the 1/n estimate rather than “NaN” or “” or “???” which breaks your whole model and prevents you from calculating anything, so long as you’re setting aside all miscommunication risks (which was the main point of my comment: to try to disentangle miscommunication and related risks from the ability to use 1/n probabilities as a default optimal). To paraphrase what I said for a different comment, in the real world maybe it is better to just throw a wrench in the whole model and say “dear principal: no, stop, we need to disengage autopilot and think longer.” But I’m not at the real world yet, because I want to make sure I am clear on why I see so many people say things like you can’t give probability estimates for pure uncertainty (when in reality it seems nothing is certain anyway and thus you can’t give 100.0% “true” point or range estimates for anything).
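As a rough illustration of that point, the sketch below uses hypothetical payoffs (a benefit of 4 per unit of cost, picked only so that 50-50 gives exactly 2:1; the original example’s numbers are not reproduced here) and compares a 1/n placeholder, a slightly refined estimate, and a missing value.

```python
import math

def benefit_cost_ratio(p_success, benefit, cost):
    """Expected benefit divided by cost for a single bet."""
    return p_success * benefit / cost

# Hypothetical payoffs, chosen so that a 50-50 estimate yields 2:1.
benefit, cost = 4.0, 1.0

ratio_even = benefit_cost_ratio(0.50, benefit, cost)
ratio_refined = benefit_cost_ratio(0.51, benefit, cost)
ratio_unknown = benefit_cost_ratio(math.nan, benefit, cost)

print(ratio_even, ratio_refined, ratio_unknown)
```

Moving from 50-50 to 51-49 shifts the ratio from 2.0 to about 2.04, while putting a NaN in place of the probability propagates through and leaves nothing to compare.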
Great to see people digging into the crucial assumptions!
In my view, @MichaelStJules makes great counterpoints to @Harrison Durland’s objection. I would like to add two further points.
(1) The notion of 1/n probability kind of breaks down if you look at an infinite number of scenarios or uncertainty values (if you talk about one particular uncertain variable). For example, let’s take population growth in economic models. Depending on your model and potential sensitivities to initial conditions, the resolution of this variable matters. For some context, the current population growth is at 1.1% per annum. But we might be uncertain about how this will develop in the future. Maybe 1.0%? Maybe 1.2%? Maybe a resolution of 0.1% is enough. And in this case, what range would feel comfortable to put a probability distribution over? [0.6, 1.5] maybe? So that n=10 and, with a uniform distribution, you get 1.4% population growth to be 10% likely? But what if minor changes are important? You end up with an infinite number of potential values, even if you restrict the range of possible values. How do we square this situation with the 1/n approach? I’m uncertain.
(2) My other point is more a disclaimer. I’m not advocating for throwing out expected-utility thinking completely. And I’m still a Bayesian at heart (which sometimes means that I pull numbers out of my behind^^). My point is that it is sometimes problematic to use a model, run it in a few configurations (i.e. for a few scenarios), calculate a weighted average of the outcomes and call it a day. This is especially problematic if we look at complex systems and models in which non-linearities are compounding quickly. If you have 10 uncertainty variables, each of them of type float with huge ranges of plausible values, how do you decide what scenarios (points in uncertainty space) to run? A posteriori weighted averaging likely fails to capture the complex interactions and the outcome distributions. What I’m trying to say is that I’m still going to assume probabilities and probability distributions in daily life. And I will still conduct expected utility calculations. However, when things get more complex (e.g. in model land), I might advocate for more caution.
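The resolution question can be sketched numerically. The snippet below puts a uniform 1/n weight on grids of decreasing step size over the same [0.6, 1.5] range. To keep the arithmetic exact, rates are stored as integers in thousandths of a percentage point (so 600 means 0.6%); that encoding and the helper names are my own assumptions.

```python
def uniform_grid(lo, hi, step):
    """Integer grid lo, lo+step, ..., hi with probability 1/n on each point."""
    points = list(range(lo, hi + 1, step))
    return points, 1.0 / len(points)

def prob_between(points, p_each, a, b):
    """Total probability of the grid points lying in [a, b]."""
    return p_each * sum(1 for x in points if a <= x <= b)

# Rates in thousandths of a percentage point: 600 is 0.6%, 1500 is 1.5%.
results = {}
for step in (100, 10, 1):  # resolutions of 0.1%, 0.01% and 0.001%
    points, p_each = uniform_grid(600, 1500, step)
    in_band = prob_between(points, p_each, 1000, 1200)  # growth in [1.0%, 1.2%]
    results[step] = (len(points), p_each, in_band)
    print(f"step={step}: n={len(points)}, 1/n={p_each:.5f}, P(1.0%..1.2%)={in_band:.4f}")
```

At the 0.1% resolution this reproduces n=10 and 10% per value. As the grid is refined, the weight on any single value shrinks towards zero while the probability of the 1.0% to 1.2% band settles near 2/9. This does not resolve the concern: the answer still depends on having picked the range and a uniform weighting over this particular parametrisation.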
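Here is a toy illustration of why averaging a few hand-picked runs can mislead when non-linearities compound. The model (ten effects multiplying together), the input ranges and the choice of three scenarios are all hypothetical; the comparison uses plain random sampling with independent uniform inputs, which is itself an assumption.

```python
import random
import statistics

def toy_model(xs):
    """Hypothetical non-linear model: effects compound multiplicatively."""
    out = 1.0
    for x in xs:
        out *= 1.0 + x
    return out

N_VARS = 10

# Three hand-picked scenarios (all-low, all-mid, all-high), equally weighted.
scenarios = [[0.0] * N_VARS, [0.5] * N_VARS, [1.0] * N_VARS]
scenario_average = statistics.mean(toy_model(s) for s in scenarios)

# Random exploration of the same box, assuming independent uniform inputs.
rng = random.Random(0)
samples = [toy_model([rng.random() for _ in range(N_VARS)]) for _ in range(20000)]
sampled_mean = statistics.mean(samples)
sampled_median = statistics.median(samples)

print(f"average of three scenarios: {scenario_average:.1f}")
print(f"sampled mean: {sampled_mean:.1f}, sampled median: {sampled_median:.1f}")
```

In this toy case the three-scenario average is several times larger than the sampled mean, because the all-high corner dominates, and the sampled median sits below the mean, so a single weighted average hides a skewed outcome distribution.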
I’m not sure I understand the concern with (1); I would first say that I think infinities are occasionally thrown around too lightly, and in this example it seems like it might be unjustified to say there are infinite possible values, especially since we are talking about units of people/population (which is composed of finite matter and discrete units). Moreover, the actual impact of a difference between 1.0000000000002% and 1.00000000000001% in most values seems unimportant for practical decision-making considerations—which, notably, are not made with infinite computation and data and action capabilities—even if it is theoretically possible to have such a difference. If something like that which seems so small is actually meaningful (e.g., it flips signs), however, then that might update you towards beliefs like “within analytical constraints the current analysis points to [balancing out |OR| one side being favored].” In other words, perhaps not pure uncertainty, since now you plausibly have some information that leans one way or another (with some caveats I won’t get into).
I think I would agree to some extent with (2). My main concern is mostly that I see people write things that (seemingly) make it sound like you just logically can’t do expected utility calculations when you face something like pure uncertainty; you just logically have to put a “?” in your models instead of “1/n,” which just breaks the whole model. Sometimes (like the examples I mentioned), the rest of the model is fine!
I contend that you can use “1/n”; it’s more a matter of “should you do so given that you run the risk of misleading yourself or your audience towards X, Y, and Z failure modes (e.g., downplaying the value of doing further analysis, putting too many eggs in one basket/ignoring non-linear utility functions, creating bad epistemic cultures which disincentivize people from speaking out against overconfidence, …).”
In other words, I would prefer to see clearer disentangling of epistemic/logical claims from strategic/communication claims.
“While useful, even models that produced a perfect probability density function for precisely selected outcomes would not prove sufficient to answer such questions. Nor are they necessary.”
I recommend reading DMDU since it goes into much more detail than I can do justice.
Yet, I believe you are focusing heavily on the concept of the distribution existing while the claim should be restated.
Deep uncertainty implies that the range of reasonable distributions allows so many reasonable decisions that attempting to “agree on assumptions then act” is a poor frame. Instead, you want to explore all reasonable distributions then “agree on decisions”.
If you are in a state where reasonable people are producing meaningfully different decisions (i.e. a different sign under your convention above) based on the distribution and weighting terms, then it becomes more useful to focus on the timeline and tradeoffs rather than the current understanding of the distribution:
Explore the largest range of scenarios (in the 1/n case each time you add another plausible scenario it changes all scenario weights)
Understand the sequence of actions/information released
Identify actions that won’t change with new info
Identify information that will meaningfully change your decision
Identify actions that should follow given the new information
Quantify tradeoffs forced with decisions
This results in building an adaptive policy pathway rather than making a decision or even choosing a model framework.
Value is derived from expanding the suite of policies, scenarios and objectives or illustrating the tradeoffs between objectives and how to minimize those tradeoffs via sequencing.
This is in contrast to emphasizing the optimal distribution (or worse, point estimate) conditional on all available data, since that distribution is still subject to change over time and is evaluated under different weights by different stakeholders.
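A very reduced sketch of a few of the steps above, under made-up numbers: the payoff table, the action names and the probability ranges are all hypothetical, and a real DMDU analysis would involve far more scenarios and objectives. It checks which action survives across a range of candidate probabilities and which action minimises worst-case regret.

```python
PAYOFFS = {  # hypothetical utilities per action under each scenario
    "intervention": {"helps": 10.0, "harms": -10.0},
    "opposite": {"helps": -10.0, "harms": 10.0},
    "robust_option": {"helps": 3.0, "harms": 3.0},
}
SCENARIOS = ("helps", "harms")

def max_regret(action):
    """Worst-case shortfall versus the best action in each scenario."""
    return max(
        max(p[s] for p in PAYOFFS.values()) - PAYOFFS[action][s]
        for s in SCENARIOS
    )

def best_action(p_helps):
    """Expected-utility maximiser for one candidate probability."""
    def ev(action):
        row = PAYOFFS[action]
        return p_helps * row["helps"] + (1 - p_helps) * row["harms"]
    return max(PAYOFFS, key=ev)

minimax_choice = min(PAYOFFS, key=max_regret)

# Sweep ranges of "reasonable" probabilities instead of agreeing on one.
choices_narrow = {best_action(p / 100) for p in range(45, 56)}
choices_wide = {best_action(p / 100) for p in range(20, 81)}

print(minimax_choice, choices_narrow, choices_wide)
```

With these numbers the robust option is chosen for every probability between 45% and 55%, so it is an action that would not change with modest new information. Only evidence pushing the probability well outside that band would change the decision, which is the kind of information worth identifying in advance.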