[EDIT 1: The following is wrong re the 1.6% number, because that was the geometric mean of odds, not the geometric mean of probabilities as I assumed here.]
By the way, as the number of samples you take goes to infinity, I think the geometric mean of the sampled probabilities converges (in probability) to a limit which has a simple form in terms of the data. (After taking the log, this should just be a consequence of the law of large numbers.) Namely, it converges to the geometric mean of all the products of numbers from individual predictions! So instead of getting 1.6% from the sampling, I think you could have multiplied 42 numbers you calculated to find the number this 1.6% would converge to in the limit as the number of samples goes to infinity. I.e., what I have in mind are the numbers that were averaged to get the 18.7% number below. [EDIT 2: That’s not quite true, because the 18.7% was not the average of the products, but instead the product of the averages.]
Could you compute this number? Or feel free to let me know if I’m missing something. I’m also happy to elaborate further on the argument for convergence I have in mind.
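For reference, here is a sketch of the convergence argument. It rests on an assumption about the sampling that is not spelled out in this thread: each sample draws, independently for each subquestion $k = 1, \dots, m$, one of the $n$ respondents uniformly at random and multiplies the drawn probabilities. The symbols $p_{ik}$ (respondent $i$’s probability for subquestion $k$), $I_k^{(j)}$ (the respondent drawn for subquestion $k$ in sample $j$), $X_j$ and $N$ are notation introduced here, and all $p_{ik}$ are assumed to be strictly positive so that the logs exist.

```latex
\begin{align*}
X_j &= \prod_{k=1}^{m} p_{I_k^{(j)},\,k}, \qquad j = 1, \dots, N, \\
\frac{1}{N} \sum_{j=1}^{N} \log X_j
  &\xrightarrow{\;P\;} \mathbb{E}[\log X_1]
   = \sum_{k=1}^{m} \frac{1}{n} \sum_{i=1}^{n} \log p_{ik}
  && \text{(law of large numbers)}, \\
\Bigl( \prod_{j=1}^{N} X_j \Bigr)^{1/N}
  &\xrightarrow{\;P\;} \prod_{k=1}^{m} \Bigl( \prod_{i=1}^{n} p_{ik} \Bigr)^{1/n}
  && \text{(exponentiating)}.
\end{align*}
```

Under that assumption, the limit is the product of the per-subquestion geometric means, which is the same number as the geometric mean of all $n^m$ possible products. This applies to the geometric mean of probabilities only, not of odds.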
I’m not completely sure I understand your request. The screenshot below is the Excel file containing the survey results. Column U is the product of columns N to S. You’d like the geometric mean of odds of column U? This is 0.023, which is approximately 2.3%. This isn’t quite the same as the estimate in my model, I think because the survey has some missing data which isn’t carried over into the model.
Thanks! That’s indeed the quantity I was interested in, modulo me incorrectly thinking that you computed the geometric mean of probabilities and not odds.
Given that you used odds when computing the geometric mean, I retract my earlier claim that there is such a simple closed-form limit as the number of samples goes to infinity. Thanks for the clarification!
Here is another claim along similar lines: in the limit as the number of samples goes to infinity, I think the arithmetic mean of your sampled probabilities (currently reported as 9.65%) should converge (in probability) to the product of the arithmetic means of the probabilities respondents gave for each subquestion. So at least for finding this probability, I think one need not have done any sampling.
If you’d like to test this claim, you could recompute the numbers in the first column below with the arithmetic mean of the probabilities replacing the geometric mean of the odds, and find what the 18.7% product becomes.
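A minimal sketch of how this claim could be checked numerically. It assumes the model samples one respondent’s answer per subquestion, independently and uniformly, and multiplies the draws; that is a guess at the sampling scheme, not a description of the actual model. The `answers` table is made-up illustrative data, not the survey’s values, and the function names are hypothetical.

```python
import itertools
import math
import random

# Hypothetical data: rows are respondents, columns are subquestions.
# These values are invented for illustration only.
answers = [
    [0.9, 0.5, 0.2],
    [0.6, 0.1, 0.4],
    [0.3, 0.8, 0.7],
]


def product_of_means(rows):
    """Product over subquestions of the arithmetic mean of each column."""
    n = len(rows)
    return math.prod(sum(col) / n for col in zip(*rows))


def exact_mean_of_products(rows):
    """Arithmetic mean over every way of picking one answer per column."""
    columns = list(zip(*rows))
    products = [math.prod(combo) for combo in itertools.product(*columns)]
    return sum(products) / len(products)


def sampled_mean_of_products(rows, n_samples, seed=0):
    """Monte Carlo estimate: draw one answer per column, multiply, average."""
    rng = random.Random(seed)
    columns = list(zip(*rows))
    total = 0.0
    for _ in range(n_samples):
        total += math.prod(rng.choice(col) for col in columns)
    return total / n_samples


if __name__ == "__main__":
    print("product of means:      ", product_of_means(answers))
    print("exact mean of products:", exact_mean_of_products(answers))
    print("sampled mean (100k):   ", sampled_mean_of_products(answers, 100_000))
```

The first two numbers agree up to floating-point error because the draws are independent across subquestions, and the sampled estimate approaches them as the number of samples grows. The sketch does not handle missing answers, which might account for small gaps between a spreadsheet calculation and a model’s output.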
Hope I’ve understood you right! I’ve taken the arithmetic mean of each column and then computed the product of those arithmetic means. I end up with 9.74%. Again, I think this differs slightly from my model’s estimate because the survey has some missing data, which doesn’t occur in the model’s synthetic distribution.
Thanks, this is great!