Note that both of these can be relevant at the same time. e.g. suppose two surveyors estimated the chance your AirB&B will collapse each night and came back with 50% and 0.00000000001%. In that case, the geometric mean approach says it is fine, but really you shouldn’t stay there tonight. However simultaneously, expected number of nights it will last without collapsing is very high.
This example weakens the case for the arithmetic mean.
First let me establish: both of the surveyors’ estimates are virtually impossible for anything listed on AirBnB. They must be fabricated, hallucinated, trolled, drunken, parasitically-motivated, wildly uncalibrated, or 2 simultaneous typos.
Even buildings that are considered structurally unsound often end up standing for years anyway, and 50% just isn’t plausible except for some extraordinary circumstances. 50% over the next 24-hour period is reasonable if the building looks like this.
And as for 0.00000000001%, this is permitted by physics but that’s the strongest endorsement I can give. This implies that after 100 million years, or 36,500,000,000 days, there would still have only been a 30.58% chance of a collapse. It’s a reasonable guess if the interior of the building is entirely filled with a very stable material, and the outside is encased in bedrock 100m below the surface, in a geologically-quiet area.
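For anyone who wants to check that kind of compounding, here is a minimal sketch. It assumes every night is independent with the same collapse probability (my simplification, not something either surveyor claimed), and `collapse_probability` is just a name I made up. I plug in a per-night probability of 1e-11, which is the value that reproduces the 30.58% figure.

```python
import math


def collapse_probability(p_per_night: float, nights: int) -> float:
    """Chance of at least one collapse over `nights` independent nights."""
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not round away.
    return -math.expm1(nights * math.log1p(-p_per_night))


nights = 100_000_000 * 365  # 100 million years = 36,500,000,000 nights
p = 1e-11                   # assumed per-night probability

print(f"{collapse_probability(p, nights):.2%}")
```

For probabilities this small the result scales almost linearly with the per-night value, so an input 100 times smaller gives an answer roughly 100 times smaller.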
You advise the reader:
but really you shouldn’t stay there tonight. However simultaneously, expected number of nights it will last without collapsing is very high.
This seems either contradictory, or needs elaboration. You show the correct intuition by suggesting the real probability is much lower, and in all likelihood the building will do the mundane thing buildings usually do: stand uncollapsed for years to come. I wouldn’t move in to start a family there, but I’m not worried if some kids camp in there for a few nights either.
So imagine giving it the arithmetic mean answer of ~25%. That is almost impossible for anything listed on AirBnB. Now I am poor at doing calculations, but I think the geometric mean is 0.00022361%. If true, then after 1,000 years it would give a chance of collapse of 55.79%. This is plausible for some kinds of real-world buildings. Personally I would expect a higher percentage, as most buildings aren’t designed to last that long, or would be deliberately demolished (and therefore “collapse”) before then. But hey, it’s a plausible forecast for many actual buildings.
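Here is a rough sketch of the two aggregates and the 1,000-year horizon. The assumptions are mine: nights are independent, the low estimate is entered as a per-night probability of 1e-11 (the value that reproduces the 0.00022361% figure), and the geometric mean is taken over probabilities rather than odds. The variable names are only illustrative.

```python
import math


def collapse_probability(p_per_night: float, nights: int) -> float:
    """Chance of at least one collapse over `nights` independent nights."""
    return -math.expm1(nights * math.log1p(-p_per_night))


p_high, p_low = 0.5, 1e-11  # assumed per-night probabilities

arithmetic = (p_high + p_low) / 2      # arithmetic mean of probabilities
geometric = math.sqrt(p_high * p_low)  # geometric mean of probabilities

nights_in_1000_years = 1_000 * 365
after_1000_years = collapse_probability(geometric, nights_in_1000_years)

print(f"arithmetic mean per night: {arithmetic:.2%}")
print(f"geometric mean per night:  {geometric * 100:.8f}%")
print(f"collapse within 1,000 years (geometric): {after_1000_years:.2%}")
```

Pooling odds instead of probabilities would give a somewhat different per-night number, so treat this as a check on the arithmetic only, not a verdict on which pooling rule is right.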
One factor in all this is that geometric mean aggregation makes more sense when there are proper-scoring incentives to be accurate, e.g. when log-scoring is used. That is, being wrong at 99.999999% confidence should totally ruin your whole track record, and you would lose whatever forecaster-prestige you could’ve had. That’s a social system where you can take extreme predictions more seriously. But in untracked setups where people can just give one-off numbers that aren’t scored, and there is no real incentive to give an accurate forecast, it’s more plausible that the arithmetic mean of probabilities ends up being superior in some cases. But even then, there are notable cases where it will be wildly off, such as the surveyor example you gave.
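To make the "ruin your whole track record" point concrete, here is a small sketch of the logarithmic scoring rule, where a forecast is scored by the natural log of the probability it gave to what actually happened. The helper name `log_score` and the coin-flip baseline are my own choices for illustration, not any particular platform's implementation.

```python
import math


def log_score(p_assigned_to_outcome: float) -> float:
    """Log score: ln of the probability given to the realized outcome."""
    return math.log(p_assigned_to_outcome)


confidence = 0.99999999

hit = log_score(confidence)       # the confident forecast was right
miss = log_score(1 - confidence)  # the confident forecast was wrong
coin_flip = log_score(0.5)        # baseline: always saying 50%

# Gain per hit and loss per miss, both relative to the 50% baseline.
gain_per_hit = hit - coin_flip
loss_per_miss = coin_flip - miss

print(f"score when right: {hit:.2e}")
print(f"score when wrong: {miss:.2f}")
print(f"hits needed to offset one miss: {loss_per_miss / gain_per_hit:.1f}")
```

The asymmetry is the point: a correct extreme forecast earns at most ln 2 over the coin-flipper, while a wrong one costs many times that.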
You raise valid points, e.g. how geomean could give terrible results under some conditions. Like if someone says “Yeah I think the probability is 1/Tree(3) man.” and the whole thing is ruined. That is a valuable and reasonable point, and there may be some domains or prestige-game setups where geomean would be broken by some yahoo giving a wild estimate. However, I don’t condone a meta-approach where you say “My aggregation method says 25%, which I’m even acknowledging can’t be right, but you should act as if it could be”. Might as well act as if it’s nonsense and just assume the base rate for AirBnB collapses.
Now if one of the surveyors made money or prestige by telling people they should worry about buildings collapsing, they may prefer the arithmetic mean in this case. I can’t vouch for the surveyors. But as a forecaster, I would do some checks against history, and conclude the number is a drastic overestimate. Far more likely the 50%-giving surveyor is trolling, confused, or selling me travel insurance or something. And in the end, I would defer to empirical results, for example those in SimonM’s great comment and question series.