In the particular example you propose, forecaster A assigns a higher probability to X, Y, and Z all happening (0.7 * 0.7 * 0.7 = 0.343) than forecaster B does (0.8 * 0.8 * 0.5 = 0.320). This seems intuitively correct.
Also, note that the squares are necessary to keep the scoring rule proper, meaning that the highest expected reward is obtained by reporting your true probability distribution. This is in principle a crucial property: otherwise people could lie about what they think their probabilities are and get a better score. In particular, if you take out the square, then the “probability” which maximizes your expected score is either 0% or 100%. For example, imagine that your probability was 60%, and calculate the expected error of writing down 60% vs. 100%: it is 0.6 * 0.4 + 0.4 * 0.6 = 0.48 for 60%, but only 0.4 for 100%.
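If you want to check this yourself, here is a minimal Python sketch. The function names and the 1% grid of candidate reports are my own choices for illustration, not anything standard. It searches for the report that minimizes the expected penalty for a single yes/no question, with and without the square:

```python
def expected_quadratic_penalty(belief, report):
    """Expected squared error (Brier-style, lower is better) for one binary event."""
    return belief * (1 - report) ** 2 + (1 - belief) * report ** 2


def expected_linear_penalty(belief, report):
    """Same as above, but with the square removed (absolute error)."""
    return belief * abs(1 - report) + (1 - belief) * abs(report)


def best_report(penalty, belief, steps=100):
    """Return the report on a grid 0%, 1%, ..., 100% with the lowest expected penalty."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda report: penalty(belief, report))


belief = 0.6  # what you actually think the probability is
quadratic_best = best_report(expected_quadratic_penalty, belief)
linear_best = best_report(expected_linear_penalty, belief)

print("best report with the square:   ", quadratic_best)
print("best report without the square:", linear_best)
```

With the square, the best report is your actual 60%; without it, the best report jumps to 100%, which is exactly the incentive to exaggerate described above.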
An alternative to the Brier score which might interest you (or which you may have had in mind) is the logarithmic scoring rule, which in a sense tries to quantify how much information you add to or subtract from the aggregate. But it has other downsides, like being very harsh on mistakes, and it would also assign a worse score to forecaster B.
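As a rough sketch of the logarithmic rule on your example, assuming (my assumption, for illustration) that X, Y, and Z all end up happening and that the score is simply the sum of the natural logs of the probabilities given to what happened; the helper name `log_score` is mine:

```python
import math


def log_score(probabilities):
    """Sum of log-probabilities assigned to events that occurred (higher is better)."""
    return sum(math.log(p) for p in probabilities)


forecaster_a = [0.7, 0.7, 0.7]
forecaster_b = [0.8, 0.8, 0.5]

log_a = log_score(forecaster_a)
log_b = log_score(forecaster_b)
print("A:", log_a, "B:", log_b)

# Exponentiating the summed log score recovers the joint probability
# each forecaster gave to X and Y and Z.
print("joint A:", math.exp(log_a), "joint B:", math.exp(log_b))

# Harshness: a 1% forecast on something that happens costs far more
# than the 50% forecast does.
print("penalty at 50%:", math.log(0.5), "penalty at 1%:", math.log(0.01))
```

Under those assumptions A comes out at roughly -1.07 and B at roughly -1.14, so B is again worse, and the exponentiated scores are just the 0.343 and 0.320 from the joint probabilities. The last line shows the harshness: the penalty grows without bound as the probability you gave to the actual outcome approaches 0.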