Thanks! I think this is really helpful.
[Warning: this comment is kind of thinking-out-loud; the ideas are not yet distilled down to their best forms.]
The only thing I want to quibble about so far is your labelling my model as more general. I don't think it really is: I had a bit of analysis based on the bivariate distribution, but that was just a variation on the univariate distribution I mostly thought about.
Really the difference between our models is in the underlying distribution they assume. I was assuming something roughly (locally) log-uniform. You assume a Pareto distribution.
When is the one distribution a more reasonable assumption than the other? This question is at the heart of things, and one I expect to want to think more about. At a first pass I like your suggestive analysis that (something like) the Pareto distribution is appropriate when there are many, many ways to spend money that are a little effective but not very. I still feel drawn to the log-uniform model when thinking about the fundamental difficulty of finding important research breakthroughs. But perhaps something like Pareto ends up being correct if we think about opportunities to fund research? There could be lots and lots of opportunities to fund mediocre research (especially if you advertise that you're willing to pay for it).
Actually the full version of this question should wrestle with needing to provide other distributions at times. In an efficient altruistic market all the best opportunities have been taken, so the top tier of remaining opportunities are all about equally good. Even if I dream up a new research area, it may to some extent funge against other types of research, so the distribution may be flatter than it would be absent the work done already by the rest of the world. (This is something I've occasionally puzzled over for several years; I think your post could provide another helpful handhold for it.)
Howdy. I appreciate your reply.
By the difference in generality I meant the difficulty-based problem selection. (Or the possibility of some other hidden variable that affects the order in which we solve problems.)
> I was assuming something roughly (locally) log-uniform. You assume a Pareto distribution.
On a closer examination of your 2014 post, I don't think this is true. If we look at the example distribution
> Assume that an area has 100 problems, the first of difficulty 1, and each of difficulty 1.05 times the previous one. Assume for simplicity that they all have equal benefits.
and try to convert it to the language I've used in this post, there's a trick with the scale density concept: because the benefits of each problem are identical, their cost-effectiveness is the inverse of difficulty, yes. But the spacing of the problems along the cost-effectiveness axis decreases as the cost increases. So the scale density, which would be the cost divided by that spacing, ends up being proportional to the inverse square of cost-effectiveness. This is easier to understand in a spreadsheet. And the inverse-square distribution is exactly where I would expect to see logarithmic returns to scale.
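For concreteness, here's a rough numerical version of that spreadsheet check (a sketch only, using the 1.05 cost ratio and unit benefits from the quoted example):

```python
import math

# Illustrative sketch: problem n costs 1.05**n and every problem has benefit 1,
# so its cost-effectiveness is e_n = 1.05**(-n).
costs = [1.05 ** n for n in range(100)]
eff = [1.0 / c for c in costs]

# Scale density = cost of a problem divided by its spacing from the next
# problem along the cost-effectiveness axis. If density ~ 1/e^2, then
# density * e^2 should come out (roughly) constant.
for n in range(4):
    spacing = eff[n] - eff[n + 1]
    density = costs[n] / spacing
    print(f"e = {eff[n]:.3f}, density = {density:.1f}, density * e^2 = {density * eff[n] ** 2:.1f}")

# Logarithmic returns: cumulative benefit (number of problems solved) should
# track log(cumulative cost), settling toward a constant offset from it.
cumulative_cost = 0.0
for n, c in enumerate(costs):
    cumulative_cost += c
    if (n + 1) % 25 == 0:
        print(f"benefit = {n + 1}, log(cost)/log(1.05) = {math.log(cumulative_cost) / math.log(1.05):.1f}")
```

The constant density * e^2 column is the inverse-square behaviour, and the second loop shows cumulative benefit growing like the log of cumulative cost (up to an offset that settles down as more problems are solved).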
As for what distributions actually make sense in real life, I really don't know. That's more for people working in concrete cause areas to figure out than for me, sitting at home doing math. I'm just happy to provide a straightforward equation for those people to punch their more empirically-informed distributions into.
Of course you’re right; my “log uniform” assumption is in a different space than your “Pareto” assumption. I think I need to play around with the scale density notion a bit more until it’s properly intuitive.