I think it depends on how much information you have.
If the extent of your evaluation is a quick search for public info, and you don't find much, I think the responsible conclusion is "it's unclear what happened" rather than "something went wrong". I think this holds even for projects that obviously should have public outputs if they've gone well. If someone got a grant to publish a book, and there's no book, that might look like a failure, but they also might have been diagnosed with cancer, or gotten a sudden offer for a promising job that left them with no time to write. (In the latter case, I'd hope they would give the grant back, but that's something a quick search probably wouldn't find.)
(That said, it still seems good to describe the search you did, just so future evaluators have something more to work with.)
On the other hand, if you've spoken to the person who got the grant, and they showed you their very best results, and you're fairly sure you aren't missing any critical information, it seems fine to publish a negative evaluation in almost every case. (I say "almost" because this is a complicated question and possible exceptions abound.)
Depending on the depth of your search and the nature of the projects (haven't read your post closely yet), I could see any of 1-5 being what I would do in your place.
If the extent of your evaluation is a quick search for public info, and you don't find much, I think the responsible conclusion is "it's unclear what happened" rather than "something went wrong". I think this holds even for projects that obviously should have public outputs if they've gone well.
So to push back against this: suppose you have four initial probabilities (legibly good, silently good, legibly bad, silently bad). Then you also have a ratio (legibly good + silently good) : (legibly bad + silently bad).
Now if you learn that the project was neither legibly good nor legibly bad, you update to the ratio silently good : silently bad. The thing is, I expect this ratio to be different from the original (legibly good + silently good) : (legibly bad + silently bad), because I expect that most projects that fail do so silently, while a large portion of successes have a post written about them.
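To make that concrete, here is a toy calculation with made-up numbers (chosen only to illustrate the direction of the update, not to describe any real grant portfolio):

```python
# Toy probabilities (purely illustrative) for the four outcomes of a funded project.
p_legibly_good  = 0.30   # project went well and there's a public write-up
p_silently_good = 0.10   # project went well but nothing was posted
p_legibly_bad   = 0.05   # project failed and there's a public post-mortem
p_silently_bad  = 0.55   # project failed and nothing was posted

# Odds of success before searching for public info.
prior_odds = (p_legibly_good + p_silently_good) / (p_legibly_bad + p_silently_bad)

# After a search turns up nothing, only the "silent" outcomes remain in play.
posterior_odds = p_silently_good / p_silently_bad

print(f"prior odds of success:     {prior_odds:.2f}")      # ~0.67
print(f"posterior odds of success: {posterior_odds:.2f}")  # ~0.18
```

With these (made-up) numbers, finding nothing moves the odds of success from roughly 2:3 to roughly 1:5.5, which is why I don't think "no information" is a neutral observation.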
For an intuition pump, suppose that none of the projects from the LTF had any information to be found about them online. Then this would probably be an update downwards. But what's true about the aggregate also seems true, probabilistically, about the individual projects.
So overall, because I disagree that the "Bayesian" conclusion is uncertainty, I do see a tension between the thing to do to maintain social harmony and the thing to do if one wants to transmit a maximal amount of information. I think this is particularly the case "for projects that obviously should have public outputs if they've gone well".
But then you also have other things, like:
Some areas (like independent research on foundational topics) might be much, much more illegible than others (e.g., organizing a conference)
Doing this kind of update might incentivize people to go into more legible areas
An error rate changes things in complicated ways. In particular, maybe the error rate in the evaluation increases the more negative the evaluation is (though I think that the opposite is perhaps more likely). This would depend on your prior about how good most interventions are.
I was too vague in my response here: By "the responsible conclusion", I mean something like "what seems like a good norm for discussing an individual project" rather than "what you should conclude in your own mind".
I agree on silent success vs. silent failure and would update in the same way you would upon seeing silence from a project where I expected a legible output.
If the book isn't published in my example, it seems more likely that some mundane thing went poorly (e.g., the book wasn't good enough to publish) than that the author got cancer or found a higher-impact opportunity. But if I were reporting an evaluation, I would still write something more like "I couldn't find information on this, and I'm not sure what happened" than "I couldn't find information on this, and the grant probably failed".
(Of course, I'm more likely to assume and write about genuine failure based on certain factors: a bigger grant, a bigger team, a higher expectation of a legible result, etc. If EA Funds makes a $1m grant to CFAR to share their work with the world, and CFAR's website has vanished three years later, I wouldn't be shy about evaluating that grant.)
I'm more comfortable drawing judgments about an overall grant round. If there are ten grants, and seven of them are "no info, not sure what happened", that seems like strong evidence that most of the grants didn't work out, even if I'm not past the threshold of calling any individual grant a failure. I could see writing something like: "I couldn't find information on seven of the ten grants where I expected to see results; while I'm not sure what happened in any given case, this represents much less public output than I expected, and I've updated negatively about the expected impact of the fund's average grant as a result."
(Not that I'm saying an average grant necessarily should have a legible positive impact; hits-based giving is a thing. But all else being equal, more silence is a bad sign.)
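As a rough sketch of why seven silent grants out of ten is strong evidence in aggregate (the per-grant probabilities below are made up, purely for illustration): if a successful grant is much more likely than a failed one to leave some findable output, then that much silence is far more consistent with a mostly-unsuccessful round than a mostly-successful one.

```python
from math import comb

# Illustrative assumptions (not from the post): how often a grant leaves
# findable public output, depending on whether it actually worked out.
P_LEGIBLE_IF_SUCCESS = 0.75
P_LEGIBLE_IF_FAILURE = 0.15

def prob_k_silent(n, k, p_success):
    """P(exactly k of n independent grants leave no findable output)."""
    p_silent = (p_success * (1 - P_LEGIBLE_IF_SUCCESS)
                + (1 - p_success) * (1 - P_LEGIBLE_IF_FAILURE))
    return comb(n, k) * p_silent**k * (1 - p_silent)**(n - k)

# Likelihood of "7 of 10 grants are silent" under two hypotheses about the round.
mostly_good = prob_k_silent(10, 7, p_success=0.8)   # ~0.03
mostly_bad  = prob_k_silent(10, 7, p_success=0.2)   # ~0.26

print(f"likelihood ratio favouring 'mostly failed': {mostly_bad / mostly_good:.1f}")  # ~9
```

Under those assumed numbers, the observation is roughly nine times as likely if most grants failed, even though no single silent grant can be labelled a failure with confidence.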