That improvement of the Metaculus community prediction seems to be approximately logarithmic, meaning that doubling the number of forecasters seems to lead to a roughly constant (albeit probably diminishing) relative improvement in performance in terms of Brier Score: Going from 100 to 200 would give you a relative improvement in Brier score almost as large as when going from 10 to 20 (e.g. an improvement by x percent).
In some of the graphs, the improvement looks like it diminishes more quickly than logarithmically, so that (e.g.) going from 100 to 200 forecasters gives a smaller improvement than going from 10 to 20. It seems like you might agree, given your “albeit probably diminishing” parenthetical. If so, could you rewrite this summary to better match that conclusion?
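To make the distinction concrete, here is a rough sketch of the two shapes I have in mind. The functional forms and the symbols $a$, $b$, $c$, $k$ and $B_\infty$ are my own assumptions for illustration, not something taken from your analysis:

```latex
% Purely logarithmic: every doubling buys the same absolute improvement.
B_{\log}(n) = a - b \ln n
\quad\Longrightarrow\quad
B_{\log}(n) - B_{\log}(2n) = b \ln 2

% Saturating alternative: each doubling buys less than the one before.
B_{\mathrm{sat}}(n) = B_\infty + c\, n^{-k}
\quad\Longrightarrow\quad
B_{\mathrm{sat}}(n) - B_{\mathrm{sat}}(2n) = c \left(1 - 2^{-k}\right) n^{-k}
```

Under the first form the gain from 100 to 200 equals the gain from 10 to 20. Under the second (with $c, k > 0$) the gain shrinks as $n$ grows and the score approaches a floor $B_\infty$. A pure logarithm also cannot hold indefinitely, since it would eventually push the Brier score below zero.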
Maybe there’s some math you could do that would give a more precise description? E.g., with your bootstrapping analysis, is there a limit that the Brier score approaches as the number of hypothetical users increases?
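As one possible way to check this, here is a minimal sketch that fits both a logarithmic curve and a curve with a finite limit to (number of forecasters, mean Brier score) pairs. Everything in it is an assumption on my part: the function names `fit_log` and `fit_saturating` are hypothetical, the saturating form `limit + c * n**(-k)` is just one plausible choice, and the data are made-up numbers standing in for your bootstrapped averages, not Metaculus results.

```python
import numpy as np


def fit_log(n, brier):
    """Least-squares fit of brier = a - b * ln(n). Returns (a, b)."""
    slope, intercept = np.polyfit(np.log(n), brier, 1)
    return intercept, -slope


def fit_saturating(n, brier, grid_size=200):
    """Fit brier = limit + c * n**(-k) by grid search over the limit.

    Assumed model, not the post's method. Returns (limit, c, k).
    """
    n = np.asarray(n, dtype=float)
    brier = np.asarray(brier, dtype=float)
    best = None
    # Candidate limits must lie below every observed score.
    for limit in np.linspace(0.0, brier.min() - 1e-6, grid_size):
        # With the limit fixed, log(brier - limit) is linear in log(n).
        slope, intercept = np.polyfit(np.log(n), np.log(brier - limit), 1)
        pred = limit + np.exp(intercept) * n ** slope
        sse = float(np.sum((pred - brier) ** 2))
        if best is None or sse < best[0]:
            best = (sse, limit, float(np.exp(intercept)), float(-slope))
    return best[1], best[2], best[3]


# Synthetic placeholder data (NOT real results): a curve that levels off at 0.10.
n_users = np.array([5, 10, 20, 50, 100, 200, 500], dtype=float)
brier = 0.10 + 0.05 * n_users ** -0.5

a, b = fit_log(n_users, brier)
limit, c, k = fit_saturating(n_users, brier)

print(f"log fit:        brier = {a:.4f} - {b:.4f} * ln(n)")
print(f"saturating fit: brier = {limit:.4f} + {c:.4f} * n^(-{k:.2f})")
```

If the saturating fit describes the bootstrapped curve clearly better than the logarithmic one, the fitted `limit` would be a rough estimate of the Brier score the community prediction approaches with many forecasters. It would depend heavily on the assumed functional form, so I would treat it as suggestive rather than precise.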