Habryka comments on Women’s Empowerment: Founders Pledge report and recommendations

Habryka 20 Dec 2018 18:28 UTC
10 points
0 ∶ 0
(Keeping this brief, and don’t have super much time to justify my full perspective, so not sure how much I will respond to comments)
I glanced at the methodology, which seemed relatively weak to me, and I looked at the recommendations which were all denominated in impact measures that I don’t care about and seemed arbitrary if you looked at it from a cause-neutral view.
I am generally not very excited about non-cause-neutral research on the forum, and this topic in particular seems like it would likely only have been analyzed because it’s a topic that’s popular, not because there is any a-priori reason to assume that the interventions in this area are particularly effective from a cause-neutral view.
I think it’s key that research in EA stays in a cause-neutral frame and tries to justify itself from that perspective. The state of the broader charity world suggests that there is a strong attractor in people choosing cause-areas that they are personally invested in, and then sticking to those, while justifying their decision to do so with post-hoc justifications. This research seems to mostly provide that post-hoc justification, which seems overall net-negative.
- Sjir Hoeijmakers🔸 20 Dec 2018 21:40 UTC
  8 points
  0 ∶ 0
  Parent
  I’d distinguish between two ways in which a report can ‘be’ cause-neutral:
  1. Whether its domain of focus/cause area was chosen purely through cause prioritisation
  2. Whether its contents are of value from a cause-neutral perspective
  Now I agree that this report is not cause-neutral on (1): it was written at least partially because many of FP’s community members are interested in women’s empowerment.*
  However, note that cause prioritisation is just a heuristic to restrict our domain of search: what you want to compare in the end are the (donation) opportunities themselves, not which cause/domain they happen to be in by some categorisation.
  Maybe you don’t think women’s empowerment should be the first domain to check when you are looking for the highest-impact charities overall, but you should at least agree that it is valuable from a cause-neutral perspective to know what the best charities within this particular domain of search are. You might then be surprised that they are actually better than you thought, or you might find that your intuition of other areas having better opportunities is confirmed.
  As the methodology of this report allows you to compare the charities to those in other areas (we don’t use outcome measures that are restricted to women’s empowerment/the analysis is done in a cause-neutral frame), I think it to be cause-neutral on (2). And I hence think it’s very much worth discussing (from a cause-neutral perspective of course!) its contents on the EA forum, e.g. how do the recommended charities compare to other near-term welfare opportunities, such as those recommended by GiveWell?
  Lastly, I don’t think this research provides a post-hoc justification for women’s empowerment: in my view it could have as much provided a justification to not donate in that area (if the best charities turn out to be worse than in other areas) as a justification to donate in that area. At FP we do research into areas not to justify our member’s initial preferences, but to be recommend high-impact opportunities tailored to those preferences (if high-impact opportunities are available), as well as to be able to make a solid, justified argument to focus on other areas (if higher-impact opportunities are available in those other areas).
  *This does not mean that the choice of writing this report was a non-cause-neutral choice: for FP to do the most good we obviously need to take our community’s preferences into account. Neither does it mean that one couldn’t arrive at women’s empowerment as a high-potential cause area through cause prioritisation.
  - Habryka 21 Dec 2018 0:02 UTC
    8 points
    0 ∶ 0
    Parent
    Maybe you don’t think women’s empowerment should be the first domain to check when you are looking for the highest-impact charities overall, but you should at least agree that it is valuable from a cause-neutral perspective to know what the best charities within this particular domain of search are. You might then be surprised that they are actually better than you thought, or you might find that your intuition of other areas having better opportunities is confirmed.
    I agree that comparisons of that type are valuable, but I don’t think that this report helps me much in doing that kind of comparison. This report did no comparative analysis of the interventions against other near-term welfare interventions, and you used denominations that make that comparison quite difficult (as SiebeRozendal pointed out in another comment).
    See for example this:
    This suggests that Village Enterprise’s programme can bring about nominal gains in consumption of about $0.99 for each $1.00 donated. Adjusting for purchasing power, this is equivalent to gains of $2.18 for each $1.00 donated.
    I don’t know how to compare an increase in consumption with other near-term interventions, so as long as this number isn’t shockingly high or low it’s quite hard for me to judge whether this is a good intervention. So while your analysis helps me a bit in comparing Village Enterprise to other near-term welfare charities, it really doesn’t help me much and I still need to put in the vast majority of work, which consists of building models about the world in how things like increases in consumption compare against direct reductions in disease burden (and then how those compare against increasing or decreasing the speed of technological progress, and other major methods of impact). The analysis has some use, but I think it’s relatively minor for the cases I am interested in.
    Lastly, I don’t think this research provides a post-hoc justification for women’s empowerment: in my view it could have as much provided a justification to not donate in that area (if the best charities turn out to be worse than in other areas) as a justification to donate in that area.
    I think the current framing of the post and report does not allow for the possibility of a negative recommendation, and I expect the casual reader to walk away with a mistaken sense that this has been chosen as a promising cause area comparable to other top cause areas. De-facto, even though the numbers seem on a first glance a lot worse than other top GiveWell recommendations, the post does not give a negative recommendation. I recognize that the report was written for a different audience than the core EA community, but I think that’s what makes it lose most of its value to me.
    - Sjir Hoeijmakers🔸 6 Jun 2019 14:29 UTC
      2 points
      0 ∶ 0
      Parent
      Hi Habryka, just wanted to draw your attention to the update above, which is in part referring to some of your comments that have been incorporated in the new version of the report. Thanks for those!
- Aaron Gertler 🔸 20 Dec 2018 23:11 UTC
  4 points
  0 ∶ 0
  Parent
  Thanks for writing this out, Habryka!
  These are all important considerations, and while I disagree about the strength of the methodology (it seems stronger than that of many posts I’ve seen be popular on the Forum), I agree that having a more comparison-friendly impact measure would have been good, as well as a justification for why we should care about this subfield within global development.
  ----
  I’m not sure how the Forum should generally regard “research into the best X charity” for values of “X” that don’t return organizations with metrics comparable to the best charities we know of.
  On the one hand, it can be genuinely useful for the community to be able to reach people who care about X by saying “with our tools, here’s what we might tell you, but if you trust this work, maybe also look at Y”.
  On the other hand, it may drain time and energy from research into causes that are more promising, or dilute the overall message of EA.
  I guess I’ll keep taking posts like this on a case-by-case basis for now, and I thought this particular case was worth a (non-strong) upvote. But I have a better understanding of why one might come to the opposite conclusion.
  - Habryka 21 Dec 2018 0:06 UTC
    12 points
    0 ∶ 0
    Parent
    I think this was the part of the report that made me distrust the methodology the most:
    Our research partner GiveWell[69] was an expert in the subfield and/or was building further expertise, and we thought it unlikely that we would find donation opportunities better than or equivalent to their current or near-future top charities within our timeframe for this research project (in the case of maternal health, family planning, HIV and other STDs, and health (other)).
    Even in the specific cause area, it seemed from the beginning likely that existing GiveWell top charities outperform the ones that this report might find (and from a casual glance at the actual impact values, this has been confirmed, with the impact from GiveWell top charities being at least 2x the impact of the top recommended charities here, such that even if you only care about women’s health you will probably get more value per dollar).
    It seems clear to me that in that case, the correct choice would have been to suggest GiveWell top charities as good interventions in this space, even if they are not explicitly targeting women’s empowerment. The fact that no single existing top-GiveWell charity was chosen suggests to me that a major filter that was applied to the prioritization was whether the charity explicitly branded itself as a charity dedicated to women’s empowerment, which I think should clearly be completely irrelevant, and made me highly suspicious of the broader process.
    - Aaron Gertler 🔸 21 Dec 2018 3:09 UTC
      7 points
      0 ∶ 0
      Parent
      Habryka: Did you see this line in the introduction of this post?
      We also recommend charities that are highly cost-effective in improving women’s lives but do not focus exclusively on women’s empowerment. We discuss these organisations, including those recommended by our research partner GiveWell, in other research reports on our website.
      On the other hand, it does seem like a specific GiveWell charity or two should have shown up on this list, or that FP should have explicitly noted GiveWell’s higher overall impact (if the impact actually was higher; it seems like GiveDirectly isn’t clearly better than Village Enterprise or Bandhan at boosting consumption, at least based on my reading of p. 5o of the 2018 GD study, which showed a boost of roughly 0.3 standard deviations in monthly consumption vs. 0.2-0.4 SDs for Bandhan’s major RCT, though there are lots of other factors in play).
      I think I’ve come halfway around to your view, and would need to read GiveWell and FP studies much more carefully to figure out how I feel about the other half (that is, whether GiveWell charities really do dominate FP’s selections).
      I’d also have to think more about whether second-order effects of the FP recommendations might be important enough to offset differences in the benefits GiveWell measures (e.g. systemic change in norms around sexual assault in some areas—I don’t think I’d end up being convinced without more data, though).
      Finally, I’ll point out that this post had some good features worth learning from, even if the language around recommending organizations wasn’t great:
      The “why is our recommendation provisional” section around NMNW, which helped me better understand the purpose and audience of FP’s evaluation, and also seems like a useful idea in general (“if your values are X, this seems really good; if Y, maybe not good enough”).
      The discussion of how organizations were chosen, and the ways in which they were whittled down (found in the full report).
      On the other hand, I didn’t like the introduction, which used a set of unrelated facts to make a general point about “challenges” without making an argument for focusing on “women’s empowerment” over “human empowerment”. I can imagine such an argument being possible (e.g. women are an easy group to target within a population to find people who are especially badly-off, and for whom marginal resources are especially useful), but I can’t tell what FP thinks of it.
      - Habryka 21 Dec 2018 17:06 UTC
        5 points
        0 ∶ 0
        Parent
        Note that GiveDirectly in general is a bit of a weird outlier in terms of GiveWell top recommendations, because it’s a lot less cost-effective than the other charities, but is very useful as a “standard candle” for evaluating whether an intervention is potentially a good target for donations. I think being better than GiveDirectly is not sufficient to be a top recommendation for a cause area.
        Methodologically, I do think there are a variety of reasons for why you should estimate a regression to the mean in these impact estimates, more so than for GiveDirectly, in large parts because the number of studies in the space is lot lower, and the method of impact is a lot more complicated in a way that allows for selective reporting.
      - Habryka 21 Dec 2018 3:55 UTC
        4 points
        0 ∶ 0
        Parent
        I did not see that line! I apologize for not reading thoroughly enough.
        I do think that makes a pretty big difference, and I retract at least part of my critique, though basically agree with the points you made.
        Sjir Hoeijmakers🔸 21 Dec 2018 13:50 UTC
        1 point
        0 ∶ 0
        Parent
        No problem, thanks for your comments anyway and please let me know if any part of your critique remains that I haven’t engaged with. (Please see edit in main post which should have cleared most up)
        Habryka 21 Dec 2018 16:11 UTC
        1 point
        0 ∶ 0
        Parent
        I think most of my critique still stands, and I am still confused why the report does not actually recommend any GiveWell top charities. The fact that the report is limiting itself to charities that exclusively focus on women’s empowerment seems like a major constraint that makes the investigation a lot less valuable from a broad cause-prioritization perspective (and also for donors who actually care about reducing women’s empowerment, since it seems very likely that the best charities that achieve that aim do not aim to achieve that target exclusively).
      - Sjir Hoeijmakers🔸 21 Dec 2018 13:44 UTC
        3 points
        0 ∶ 0
        Parent
        Habryka: Did you see this line in the introduction of this post?
        Thanks for pointing this out, Aaron! Happy that’s cleared up.
        On the other hand, it does seem like a specific GiveWell charity or two should have shown up on this list, or that FP should have explicitly noted GiveWell’s higher overall impact (if the impact actually was higher; it seems like GiveDirectly isn’t clearly better than Village Enterprise or Bandhan at boosting consumption, at least based on my reading of p. 5o of the 2018 GD study, which showed a boost of roughly 0.3 standard deviations in monthly consumption vs. 0.2-0.4 SDs for Bandhan’s major RCT, though there are lots of other factors in play).
        I think I’ve come halfway around to your view, and would need to read GiveWell and FP studies much more carefully to figure out how I feel about the other half (that is, whether GiveWell charities really do dominate FP’s selections).
        Please see my updates in the main post and let me know if you still have questions about this. (Do you now understand why we didn’t recommend any other specific GW- or FP-recommended charity in this report, but referred to them as a group?)
        On the other hand, I didn’t like the introduction, which used a set of unrelated facts to make a general point about “challenges” without making an argument for focusing on “women’s empowerment” over “human empowerment”. I can imagine such an argument being possible (e.g. women are an easy group to target within a population to find people who are especially badly-off, and for whom marginal resources are especially useful), but I can’t tell what FP thinks of it.
        I hope the reason for this is now also clearer, given the purpose of the report.
        Habryka 21 Dec 2018 16:58 UTC
        23 points
        0 ∶ 0
        Parent
        Please see my updates in the main post and let me know if you still have questions about this. (Do you now understand why we didn’t recommend any other specific GW- or FP-recommended charity in this report, but referred to them as a group?)
        As I mentioned in the other comment, I am still not sure why you do not recommend any GW top charities directly. It seems like your report should answer the question “what charities improve women’s health the most?” not the question “what charities that exclusively focus on women’s health are most effective?”. The second one is a much narrower question and its answer will probably not overlap much with the answer to the first question.
        You mention them, but only in a single paragraph. It seems that even from the narrow value perspective of “I only care about women’s empowerment” the question of “are women helped more by GiveWell charities or the charities recommended here?” is a really key question that your report should try to answer.
        The top of your report also says the following:
        We researched charity programmes to find those that most cost-effectively improve the lives of women and girls.
        This however does not actually seem to be the question you are answering, as I mentioned above. I expect the best interventions for women’s empowerment to not exclusively focus on doing so (because there are many many more charities trying to improve overall health, because women’s empowerment seems like it would overlap a lot with general health goals, etc). I even expect them to not overlap that much with GiveWell’s recommendations, though that’s a critique on a higher level that I think we can ignore for now.
        To be transparent about my criticism here, the feeling that I’ve gotten from this report, is that the goal of the report was not to answer the question of “how can we best achieve the most good for the value of women’s empowerment?” but was instead focusing on the question “what set of charity recommendations will most satisfy our potential donors, by being rigorous and seeming to cover most of the areas we are supposed to check”.
        To be clear, I think the vast majority of organizations fall into this space, even in EA, and I have roughly similar (though weaker) criticisms for GiveWell itself, which focuses on global development charities in a pretty unprincipled way that I think has a lot to do with global development being transparent in a way that more speculative interventions are not (though most of the key staff has switched from GiveWell to OpenPhil now, I think in parts because of the problems of that approach that I am criticizing here).
        I think focusing on that transparency can sometimes be worth it for an individual organization in the long run by demonstrating good judgement and therefore attracting additional resources (as it did in the case of GiveWell), but generally results in the work not being particularly useful for answering the real question of “how can we do the most good?”.
        And on the margin I think that that kind of research is net-harmful for the overall quality of research and discussion on general cause-prioritization by spreading a methodology that is badly suited for answering the much more difficult questions of that domain (similarly to how p-testing has had a negative effect on psychology research, by it being a methodology that is badly suited for the actual complexity of the domain, while still being well-suited to answer questions in a much narrower domain).
        I think overall this report is pretty high-quality by the standards of global development research, but a large number of small things (the choice of focus area, limiting yourself to charities exclusively focused on women’s empowerment, the narrow methodological focus, and I guess my priors for orgs working in this space) give me the sense that this report was not primarily written with the goal of answering the question “what interventions will actually improve women’s lives?” but was instead more trying to do a broad thing, a large part of which was to look rigorous and principled, conform to what your potential donors expect from a rigorous report, be broadly defensible, and fit with the skills and methodologies that your current team has (because those are the skills that are prevalent in the global development community).
        And I think all of those aims are reasonable aims for the goal of FP, I just think they together make me expect that EAs with a different set of aims will not benefit much from engaging with this research, and because you can’t be fully transparent about those aims (because doing so would confuse your primary audience or be perceived as deceptive), it will inevitably confuse at least some of the people trying to do something that is more aligned with my aims and detract from what I consider key cause-prioritization work.
        This overall leaves me in a place where I am happy about this research and FP existing, and think it will cause valuable resources to be allocated towards important projects, but where I don’t really want a lot more of it to show up on the EA Forum. I respect your work and think what you are doing is broadly good (though I obviously always have recommendations for things I would do differently).
        Sjir Hoeijmakers🔸 14 Feb 2019 12:10 UTC
        6 points
        0 ∶ 0
        Parent
        Hi Habryka,
        This is to thank you (and others) once more for all your comments here, and to let you know they have been useful and we have incorporated some changes to account for them in a new version of the report, which will be published in March or April. They were also useful in our internal discussion on how to frame our research, and we plan to keep improving our communication around this throughout the rest of the year, e.g. by publishing a blog post / brief on cause prioritisation for our members.
        I also largely agree with the views you express in your last post above, insofar as they pertain to the contents of this report specifically. However, very importantly, I should stress that your comments do not apply to FP research generally: we generally choose the areas we research through cause prioritisation / in a cause neutral way, and we do try to answer the question ‘how can we achieve the most good’ in the areas we investigate, not (even) shying away from harder-to-measure impact. In fact, we are moving more and more in the latter direction, and are developing research methodology to do so (see e.g. our recently published methodology brief on policy interventions).
        Some of our reports so far have been an exception to these rules for pragmatic (though impact-motivated) reasons, mainly:
        We quickly needed to build a large enough ‘basic’ portfolio of relatively high-impact charities, so that we could make good recommendations to our members.
        There are some causes our members ask lots of questions about / are extra interested in, and we want to be able to say something about those areas, even if we in the end recommend them to focus on other areas instead, when we find better opportunities there.
        But there’s definitely ways in which we can improve the framing of these exceptions, and the comments you provided have already been helpful in that way.
    - kbog 21 Dec 2018 1:58 UTC
      4 points
      0 ∶ 0
      Parent
      Good point, though what about the $60/sexual assault one? That impact even seems better than AMF for combined impact.