I think the example Ben cites in his reply is very illustrative.
You might feel that you can't justify your one specific choice of prior over another, so that particular choice is arbitrary, and then what you should do could depend on this arbitrary choice, whereas an equally reasonable prior would recommend a different decision. Someone else could have exactly the same information as you but, due to a different psychology or just different patterns of neurons firing, come up with a different prior that ends up recommending a different decision. Choosing one prior over another without reason seems like a whim or a bias, and potentially especially prone to systematic error.
It seems bad if we're basing how to do the most good on whims and biases.
If you're lucky enough to have only finitely many equally reasonable priors, then I think it does make sense to just use a uniform meta-prior over them, i.e. just take their average. This doesn't seem to work with infinitely many priors, since you could use different parametrizations to represent the same continuous family of distributions, with a different uniform distribution and therefore a different average for each parametrization. You'd have to justify your choice of parametrization!
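To make the parametrization problem concrete, here is a minimal sketch (my own toy example, not from the discussion above): the same family of exponential distributions, averaged under a uniform meta-prior over the rate versus a uniform meta-prior over the scale, yields two different "average" distributions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# One family of exponential distributions, two parametrizations.
# Uniform meta-prior over the rate lam in [1, 2]:
lam = rng.uniform(1.0, 2.0, n)
x_rate = rng.exponential(scale=1.0 / lam)

# Uniform meta-prior over the scale s = 1/lam in [0.5, 1]:
s = rng.uniform(0.5, 1.0, n)
x_scale = rng.exponential(scale=s)

# Same family, but the two "uniform" averages disagree:
print(x_rate.mean())   # ≈ ln(2) ≈ 0.693
print(x_scale.mean())  # ≈ 0.75
```

Which parametrization you take the uniform average over changes the answer, so "just average over the family" doesn't pin anything down by itself.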
As another example, imagine you have a coin that someone (who is trustworthy) has told you is biased towards heads, but they haven't given you any hint how much, and you want to come up with a probability distribution for the fraction of heads over 1,000,000 flips. So, you want a distribution over the interval [0, 1]. Which distribution would you use? Say you give me a probability density function f. Why not (1−p)f(x) + p for some p ∈ (0,1)? Why not f(x^p) / ∫₀¹ f(x^p) dx for some p > 0? If f is a weighted average of multiple distributions, why not apply one of these transformations to one of the component distributions and choose the resulting weighted average instead? Why the particular weights you've chosen and not slightly different ones?
Which distribution would you use? Why the particular weights you've chosen and not slightly different ones?
I think you just have to make your distribution uninformative enough that reasonable differences in the weights don't change your overall conclusion. If they do, then I would concede that you really are clueless about your specific question. Otherwise, you can probably find a response.
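One concrete way to check that (a sketch of my own; the two candidate priors and the mixture weights are made up for illustration): perturb the weights and see whether the quantity you care about moves enough to change your conclusion.

```python
import numpy as np
from scipy.stats import beta

# Two candidate priors for the heads-bias p on (0.5, 1]: flat, and Beta(2,2)
# truncated at 0.5 (Beta(2,2) is symmetric, so its mass above 0.5 is 0.5).
grid = np.linspace(0.5, 1.0, 2001)[1:]
dx = grid[1] - grid[0]
flat = np.full_like(grid, 2.0)              # uniform density on (0.5, 1]
trunc = beta(2, 2).pdf(grid) / 0.5          # truncated Beta(2,2) density

for w in (0.3, 0.5, 0.7):                   # perturb the mixture weight
    mix = w * flat + (1 - w) * trunc
    prob = np.sum(mix[grid > 0.9]) * dx     # P(p > 0.9) under the mixture
    print(f"w = {w}: P(p > 0.9) ≈ {prob:.3f}")
```

Here P(p > 0.9) stays between roughly 0.10 and 0.16 across the weights, so a conclusion like "an extreme bias is unlikely" wouldn't hinge on the arbitrary part of the choice.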
come up with a probability distribution for the fraction of heads over 1,000,000 flips.
Rather than thinking directly of an appropriate distribution for the 1,000,000 flips, I'd think of a distribution to model p itself. Then you can run simulations based on the distribution of p to calculate the distribution of the fraction of heads over the 1,000,000 flips. Since the coin is biased towards heads, p ∈ (0.5, 1.0], and we need to select a distribution for p over that range.
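For instance, a minimal simulation along these lines (the uniform prior over p here is just a placeholder choice, discussed next):

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_flips = 10_000, 1_000_000

p = rng.uniform(0.5, 1.0, n_sims)    # prior draws for the heads-bias p
heads = rng.binomial(n_flips, p)     # simulate 1,000,000 flips per draw
frac = heads / n_flips               # distribution of the heads fraction

print(np.quantile(frac, [0.05, 0.5, 0.95]))
```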
There is no one correct probability distribution for p because any probability is just an expression of our belief, so you may use whatever probability distribution genuinely reflects your prior belief. A uniform distribution is a reasonable start. Perhaps you really are clueless about p, in which case, yes, there's a certain amount of subjectivity about your choice. But prior beliefs are always inherently subjective, because they simply describe your belief about the state of the world as you know it now. The fact you might have to select a distribution, or set of distributions with some weighted average, is merely an expression of your uncertainty. This in itself, I think, doesn't stop you from trying to estimate the result.
I think this expresses in Bayesian terms the philosophical idea that we can only make moral choices based on the information available at the time; one can't be held morally responsible for mistakes made on the basis of information one didn't have.
Perhaps you disagree with me that a uniform distribution is the best choice. You reason thus: "we have some idea about the properties of coins in general. It's difficult to make a coin that is 100% biased towards heads, so that seems unlikely." So we could pick a distribution that better reflects your prior belief. Perhaps a suitable choice might be Beta(2,2) truncated at 0.5, which gives the greatest density to p just above 0.5, declining towards 1.0.
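Sampling from that truncated prior is straightforward, e.g. by inverse-CDF sampling; a sketch (the shape parameters and truncation point are just the ones suggested above):

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(0)

# Beta(2,2) is symmetric, so its CDF at 0.5 is exactly 0.5; drawing the CDF
# value uniformly from (0.5, 1] and inverting it restricts p to (0.5, 1].
u = rng.uniform(0.5, 1.0, 10_000)
p = beta(2, 2).ppf(u)    # draws from Beta(2,2) truncated at 0.5

# These draws could replace the uniform prior in the earlier simulation.
```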
Maybe you and I just can't agree after all: there is still no consistent and reasonable prior choice you can make, and not even any compromise. And let's say we both run simulations using our own priors, find entirely different results, and can't agree on any suitable weighting between them. In that case, yes, I can see you have cluelessness. I don't think it follows that, if we went through the same process for estimating the longtermist moral worth of malaria bednet distribution, we must have intractable complex cluelessness about it. I think I can admit that perhaps, right now, in our current belief state, we are genuinely clueless, but it seems there is some work that can be done that might eliminate the cluelessness.
It seems bad if we're basing how to do the most good on whims and biases.
I agree. However, in cases where priors play a crucial role, could one not simply prioritise gathering more evidence until there is reasonable convergence about what to do (among a given group of people, for a particular decision)?
In some cases, we can't gather strong enough evidence, say because:
they're questions about very speculative or unprecedented possibilities, and the evidence would either be too indirect and weak or come too late to be very action-guiding (e.g. often for AI risk or conscious subsystems), or
there will be too much noise or confounding, or too small a sample size, and anything like an RCT is too impractical (e.g. policy, corporate outreach) or wouldn't generalize well, or
the disagreements are partly conceptual, definitional or philosophical, e.g. "What is consciousness?", "What is the hedonic intensity of an experience?"
EDIT: generally, the window to intervene is too small to wait for the evidence.
In such cases, I think imprecise probabilities are the way to go to reduce arbitrariness. We can do sensitivity analysis. If whether the intervention looks good or bad overall depends highly on fairly arbitrary judgements or priors, we might disprefer it and prefer to support things that are more robustly positive. This is difference-making ambiguity aversion.
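As a toy illustration of that kind of sensitivity analysis (every prior and payoff number below is made up): evaluate the intervention's expected value under several "equally reasonable" priors and check whether its sign is stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy payoff: the intervention helps with probability q, backfires otherwise.
def expected_payoff(q, gain=1.0, loss=-1.0):
    return q * gain + (1 - q) * loss

# A set of "equally reasonable" priors over q (an imprecise credence).
priors = {
    "uniform":     rng.uniform(0.0, 1.0, 100_000),
    "optimistic":  rng.beta(4, 2, 100_000),
    "pessimistic": rng.beta(2, 4, 100_000),
}

for name, q in priors.items():
    print(f"{name:12s} EV ≈ {expected_payoff(q).mean():+.3f}")

# The sign of the EV flips across the prior set (≈ +0.33 vs −0.33), so a
# difference-making ambiguity-averse agent might prefer an option whose EV
# is positive under every prior in the set instead.
```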
And/or we can do some kind of bracketing.
Also, you should think of research as an intervention itself that could backfire. Who could use the research, and could they use it in ways you'd judge as very negative? How likely is that? This will of course depend on the case and your own specific views.
The reasons you mentioned for gathering strong evidence not being possible (or being very difficult) apply to some extent to efforts to increase human welfare, but humans have probably still made progress on increasing human welfare over the past 200 years or so? Can one be confident similar progress cannot be extended to non-humans?
I agree research can backfire. However, at least historically, doing research on the sentience of animals, and on how to increase their welfare, has mostly been beneficial for the target animals?