Strong upvoted - I found this very interesting, both for various parts of the specific evaluations and more generally as an example of one way to do longtermist charity evaluation (which currently seems to be rarely done in anything beyond a cursory or solely qualitative way, at least in public writings).
One question I have is what you saw as the key purposes of this post. Some possibilities:
Inform decisions about donations that are each in something like the $10-$5,000 range
Inform decisions about donations/grants that are each in something like the >$50,000 range
(Obviously I'm missing the $5,000-$50,000 range; I have a vague sense that the more interesting question is which of those two buckets I pointed to you're more focused on, if either)
Inform decisions about which of these orgs (if any) to work for
Provide feedback to these orgs that causes them to improve
Provide an accountability mechanism for these orgs that causes them to work harder or smarter so that they look better on such evaluations in future
Just see if this sort of evaluation can be done, learn more about how to do that, and share that meta-level info with the EA public
[something else]
I ask partly because:
My sense is that, traditionally, public charity evaluations are mostly focused on informing decisions by individual donors giving non-huge sums each
But this seems somewhat less relevant in longtermism than in other EA cause areas
Longtermism seems somewhat less funding-constrained than other EA cause areas
In longtermism compared to in other areas, evaluation seems harder for various reasons, and so the case for giving to a donation lottery or a fund whose dollars are distributed by specialist grantmakers seems stronger
But I guess your post, or other things like it, could mitigate this
But I still think some of the bottlenecks aren't addressed, e.g. I think nonpublic info is more often relevant for evaluating longtermist orgs than for evaluating animal welfare orgs
Also, you rarely talked about room for more funding, which seems to imply you weren't focused primarily on informing donors?
(To be clear, "what did you see as the key purposes of this post?" is a sincere rather than rhetorical question, and I think this post is great.)
(I work for two of the orgs discussed in this post and as a grantmaker for a fund, but this comment, as usual, expresses personal views only.)
I actually think it would be cool to have more posts that explicitly discuss which organizations people should go work at (and what might make it a good personal fit for them).
Thanks Michael. Going through your options one by one.
Inform decisions about donations that are each in something like the $10-$5,000 range. Not an aim I had, but sure, why not.
Inform decisions about donations/grants that are each in something like the >$50,000 range. Rather than inform those directly, I'd hope to inform the kind of research that you can either do or buy with money to inform that donation. $50,000 feels a little low for commissioning research to make a decision, though (could a $5k to $10k investment in a better version of this post make a $50k donation more than 10-20% better? Plausibly).
That said, I'd be curious whether any largish donations are changed as a result of this post, and why, and in particular why the donors didn't defer to the LTF fund.
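The implicit breakeven arithmetic behind the "$50,000 feels a little low" point can be sketched as follows (figures taken from the comment; this is an illustration, not a model anyone actually ran):

```python
# Commissioning research is worth it when the expected improvement to
# the donation exceeds the cost of the research itself.
def research_worthwhile(donation, improvement_fraction, research_cost):
    return donation * improvement_fraction > research_cost

# A 10-20% improvement on a $50k donation is worth $5k-$10k, i.e. about
# the cost of the research itself, so $50k only roughly breaks even.
print(research_worthwhile(50_000, 0.20, 5_000))    # True
print(research_worthwhile(50_000, 0.10, 10_000))   # False
# At $500k the same research clears the bar comfortably.
print(research_worthwhile(500_000, 0.10, 10_000))  # True
```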
Inform decisions about which of these orgs (if any) to work for. Not really an aim for myself, but I'd be happy for people to read this post as part of their decisions. Also, 80,000 Hours exists.
Provide feedback to these orgs that causes them to improve. Sure, but not a primary aim.
Provide an accountability mechanism for these orgs that causes them to work harder or smarter so that they look better on such evaluations in future. No, not really.
Just see if this sort of evaluation can be done, learn more about how to do that, and share that meta-level info with the EA public. Yep.
[Something else]. Show the kind of thing that an organization like QURI can do! In particular, you can't do this kind of thing using software other than Foretold: Metaculus is great, but its questions are too ambiguous, getting questions approved takes time (and, in the case of a tournament, money), and for this post I only needed my own predictions (not that you can't run a tournament on Foretold).
[Something else]. Learn more about the longtermist ecosystem myself.
[Something else]. This was sort of on the edges of this project, but making large numbers of predictions does require a pipeline, and improving that pipeline has been on my mind (and on Ozzie Gooen's). For instance, creating the 27 predictions one by one would have been kind of a pain, so instead I used a Google Docs script which feeds them to Foretold.
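As a rough illustration of what such a pipeline can look like (a hypothetical sketch, not the actual Google Docs script or Foretold's real API; the payload fields and channel name are invented), one can turn a doc with one question per line into a batch of structured payloads:

```python
import json

def parse_questions(doc_text, channel="org-evaluations"):
    """Turn a doc with one prediction question per non-empty line
    into a list of payload dicts ready to send to a forecasting API."""
    payloads = []
    for line in doc_text.splitlines():
        line = line.strip()
        if not line:
            continue
        payloads.append({"title": line, "channel": channel})
    return payloads

# Example text standing in for the Google Doc's contents.
doc = """Will organization A publish at least 10 reports in 2021?
Will organization B double its budget by the end of 2022?"""

batch = parse_questions(doc)
print(json.dumps(batch, indent=2))
```

The point of the sketch is only that, once questions live in one plain-text document, creating all 27 becomes a single batch operation rather than 27 manual form submissions.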
I also think that 4. and 5. are too strongly worded. To the extent I'm providing feedback, I imagine it's more a) of the sanity-check variety or b) about how a relatively sane person perceives these organizations. For instance, if I don't get pushback about it in the comments, I'll think that it's a good idea for the APPGFG to expand, but I doubt that's something they themselves haven't thought about.
In an ideal world we'd have intense evaluations of all organizations, specific to all possible uses, done in communication styles relevant to all people.
Unfortunately this is an impossible amount of work, so we have to find some messy shortcuts that get much of the benefit at a decent cost.
I'm not sure how best to focus longtermist organization evaluations to maximize gains across a diversity of types of decisions. Fortunately, I think that whenever one makes an evaluation for one specific purpose (funding decisions), it winds up being relevant for other purposes (career decisions, organizational decisions).
My primary interest at this point is in evaluations of the following:
How much total impact is an organization having, positive or negative?
How can such impact be improved?
How efficient is the organization (in terms of money and talent)?
How valuable is it to other groups or individuals to read / engage with the work of this organization? (Think Yelp or Amazon reviews.)
My guess is that such investigations will help answer a wide assortment of different questions.
To echo what Nuño said, some of my interest in this specific task was in making a fairly general-purpose attempt. I think that increasingly substantial attempts are a pretty good bet, because a whole lot could either go wrong (this work upsets some group or includes falsehoods) or new ideas could be figured out (particularly by commenters, such as those on this post).
In the longer term my preference isn't for QURI/Nuño to be doing the majority of public evaluations of longtermist orgs, but instead for others to do most of this work. Perhaps this could become something of a standard blog post type, and/or there could be 1-2 small organizations dedicated to it. I think it really should be done independently of other large orgs (to be less biased and more isolated), so it probably wouldn't make sense for this work to be done as part of a much bigger organization.
Also, I'd agree that <$1M funding decisions aren't the main thing I'm interested in. I think that talent and larger allocations are much more exciting.
For example, perhaps it's realized that one small nonprofit's work is much more valuable than expected, so future donors wind up spending $200M on related work down the line. Or there could be systemic effects, like new founders being inspired by trends identified in the evaluations and creating better nonprofits because of it.
+1, to both the questions and the answers.