I think it depends on how much information you have.
If the extent of your evaluation is a quick search for public info, and you don't find much, I think the responsible conclusion is "it's unclear what happened" rather than "something went wrong". I think this holds even for projects that obviously should have public outputs if they've gone well. If someone got a grant to publish a book, and there's no book, that might look like a failure, but they also might have been diagnosed with cancer, or gotten a sudden offer for a promising job that left them with no time to write. (In the latter case, I'd hope they would give the grant back, but that's something a quick search probably wouldn't find.)
(That said, it still seems good to describe the search you did, just so future evaluators have something more to work with.)
On the other hand, if you've spoken to the person who got the grant, and they showed you their very best results, and you're fairly sure you aren't missing any critical information, it seems fine to publish a negative evaluation in almost every case. (I say "almost" because this is a complicated question and possible exceptions abound.)
Depending on the depth of your search and the nature of the projects (haven't read your post closely yet), I could see any of 1-5 being what I would do in your place.
If the extent of your evaluation is a quick search for public info, and you don't find much, I think the responsible conclusion is "it's unclear what happened" rather than "something went wrong". I think this holds even for projects that obviously should have public outputs if they've gone well.
So to push back against this: suppose you have four initial probabilities (legibly good, silently good, legibly bad, silently bad). Then you also have a ratio (legibly good + silently good) : (legibly bad + silently bad).
Now if you learn that the project was neither legibly good nor legibly bad, you update to the ratio silently good : silently bad. The thing is, I expect this ratio to be different from the original (legibly good + silently good) : (legibly bad + silently bad), because I expect that most projects that fail do so silently, while a large portion of successes have a post written about them.
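To make that concrete, here is a toy calculation with made-up numbers (chosen only to illustrate the direction of the update, not to describe any real grant portfolio):

```python
# Toy probabilities (purely illustrative) for the four outcomes of a funded project.
p_legibly_good  = 0.30   # project went well and there's a public write-up
p_silently_good = 0.10   # project went well but nothing was posted
p_legibly_bad   = 0.05   # project failed and there's a public post-mortem
p_silently_bad  = 0.55   # project failed and nothing was posted

# Odds of success before searching for public info.
prior_odds = (p_legibly_good + p_silently_good) / (p_legibly_bad + p_silently_bad)

# After a search turns up nothing, only the "silent" outcomes remain in play.
posterior_odds = p_silently_good / p_silently_bad

print(f"prior odds of success:     {prior_odds:.2f}")      # ~0.67
print(f"posterior odds of success: {posterior_odds:.2f}")  # ~0.18
```

With these (made-up) numbers, finding nothing moves the odds of success from roughly 2:3 to roughly 1:5.5, which is why I don't think "no information" is a neutral observation.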
For an intuition pump, suppose that none of the projects from the LTF had any information to be found about them online. Then this would probably be an update downwards. But what's true about the aggregate also seems true, probabilistically, about the individual projects.
So overall, because I disagree that the "Bayesian" conclusion is uncertainty, I do see a tension between the thing to do to maintain social harmony and the thing to do if one wants to transmit a maximal amount of information. I think this is particularly the case "for projects that obviously should have public outputs if they've gone well".
But then you also have other things, like:
Some areas (like independent research on foundational topics) might be much, much more illegible than others (e.g., organizing a conference)
Doing this kind of update might incentivize people to go into more legible areas
An error rate changes things in complicated ways. In particular, maybe the error rate in the evaluation increases the more negative the evaluation is (though I think that the opposite is perhaps more likely). This would depend on your prior about how good most interventions are.
I was too vague in my response here: By "the responsible conclusion", I mean something like "what seems like a good norm for discussing an individual project" rather than "what you should conclude in your own mind".
I agree on silent success vs. silent failure and would update in the same way you would upon seeing silence from a project where I expected a legible output.
If the book isn't published in my example, it seems more likely that some mundane thing went poorly (e.g., the book wasn't good enough to publish) than that the author got cancer or found a higher-impact opportunity. But if I were reporting an evaluation, I would still write something more like "I couldn't find information on this, and I'm not sure what happened" than "I couldn't find information on this, and the grant probably failed".
(Of course, I'm more likely to assume and write about genuine failure based on certain factors: a bigger grant, a bigger team, a higher expectation of a legible result, etc. If EA Funds makes a $1m grant to CFAR to share their work with the world, and CFAR's website has vanished three years later, I wouldn't be shy about evaluating that grant.)
I'm more comfortable drawing judgments about an overall grant round. If there are ten grants, and seven of them are "no info, not sure what happened", that seems like strong evidence that most of the grants didn't work out, even if I'm not past the threshold of calling any individual grant a failure. I could see writing something like: "I couldn't find information on seven of the ten grants where I expected to see results; while I'm not sure what happened in any given case, this represents much less public output than I expected, and I've updated negatively about the expected impact of the fund's average grant as a result."
(Not that I'm saying an average grant necessarily should have a legible positive impact; hits-based giving is a thing. But all else being equal, more silence is a bad sign.)
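As a rough sketch of why seven silent grants out of ten is strong evidence in aggregate (the per-grant probabilities below are made up, purely for illustration): if a successful grant is much more likely than a failed one to leave some findable output, then that much silence is far more consistent with a mostly-unsuccessful round than a mostly-successful one.

```python
from math import comb

# Illustrative assumptions (not from the post): how often a grant leaves
# findable public output, depending on whether it actually worked out.
P_LEGIBLE_IF_SUCCESS = 0.75
P_LEGIBLE_IF_FAILURE = 0.15

def prob_k_silent(n, k, p_success):
    """P(exactly k of n independent grants leave no findable output)."""
    p_silent = (p_success * (1 - P_LEGIBLE_IF_SUCCESS)
                + (1 - p_success) * (1 - P_LEGIBLE_IF_FAILURE))
    return comb(n, k) * p_silent**k * (1 - p_silent)**(n - k)

# Likelihood of "7 of 10 grants are silent" under two hypotheses about the round.
mostly_good = prob_k_silent(10, 7, p_success=0.8)   # ~0.03
mostly_bad  = prob_k_silent(10, 7, p_success=0.2)   # ~0.26

print(f"likelihood ratio favouring 'mostly failed': {mostly_bad / mostly_good:.1f}")  # ~9
```

Under those assumed numbers, the observation is roughly nine times as likely if most grants failed, even though no single silent grant can be labelled a failure with confidence.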