Doctor from NZ, independent researcher (grand futures / macrostrategy) collaborating with FHI / Anders Sandberg. Previously: Global Health & Development research @ Rethink Priorities.
Feel free to reach out if you think there’s anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you’re a medical student / junior doctor reconsidering your clinical future, or if you’re quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.
Outside of EA, I do a bit of end of life care research and climate change advocacy, and outside of work I enjoy some casual basketball, board games and good indie films. (Very) washed up classical violinist and Oly-lifter.
All comments in personal capacity unless otherwise stated.
bruce
To add sources for some recent examples that come to mind that broadly support MHR’s point above RE: visible (ex post) failures that don’t seem to be harshly punished (responses to most seem somewhere between neutral and supportive, at least publicly):
Lightcone
Alvea
ALERT
AI Safety Support
EA hub
No Lean Season
Some failures that came with a larger proportion of critical feedback probably include the Carrick Flynn campaign (1, 2, 3), but even here “harshly punish” seems like an overstatement. HLI also comes to mind (and despite highly critical commentary in earlier posts, I think the highly positive response to this specific post is telling).
============
On the extent to which Nonlinear’s failures relate to integrity / engineering, I think I’m sympathetic to both Rob’s view:
I think the failures that seem like the biggest deal to me (Nonlinear threatening people and trying to shut down criticism and frighten people) genuinely are matters of character and lack of integrity, not matters of bad engineering.
As well as Holly’s:
If you wouldn’t have looked at it before it imploded and thought the engineering was bad, I think that’s the biggest thing that needs to change. I’m concerned that people still think that if you have good enough character (or are smart enough, etc), you don’t need good boundaries and systems.
but do not think these are necessarily mutually exclusive.
Specifically, it sounds like Rob is mainly thinking about the source of the concerns, and Holly is thinking about what to do going forwards. And it might be the case that the most helpful actionable steps going forward are things that look more like improving boundaries and systems, regardless of whether you believe failures specific to Nonlinear are caused by deficiencies in integrity or engineering.
That said, I agree with Rob’s point that the most significant allegations raised about Nonlinear quite clearly do not fit the category of ‘appropriate experimentation that the community would approve of’, under almost all reasonable perspectives.
I was a participant and largely endorse this comment.
One contributor to a lack of convergence was attrition of effort and incentives: by the time there was superforecaster-expert exchange, we’d been at it for months, and there weren’t requirements for forum activity (unlike in the first team stage).
[Edit: wrote this before I saw lilly’s comment, would recommend that as a similar message but ~3x shorter].
============
I would consider Greg’s comment as “brought up with force”, but would not consider it an “edge case criticism”. I also don’t think James / Alex’s comments are brought up particularly forcefully.
I do think it is worth making the case that pushing back on comments that are easily misinterpreted or misleading is also not an edge case criticism, especially if these are comments that directly benefit your organisation.
Given the stated goal of the EA community is “to find the best ways to help others, and put them into practice”, it seems especially important that strong claims are sufficiently well-supported, and made carefully + cautiously. This is in part because the EA community should reward research outputs if they are helpful for finding the best ways to do good, not solely because they are strongly worded; in part because EA donors who don’t have capacity to engage at the object level may be happy to defer to EA organisations/recommendations; and in part because the counterfactual impact of donations diverted from an EA donor is likely higher than for the average donor.
For example:
“We’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money”.[1]
Michael has expressed regret about this statement, so I won’t go further into this than I already have. However, there is a framing in that comment that suggests this is an exception, because “HLI is quite well-caveated elsewhere”, and I want to push back on this a little.
HLI has previously been mistaken for an advocacy organisation (1, 2). This isn’t HLI’s stated intention (which is closer to a “Happiness/Wellbeing GiveWell”). I outline why I think this is a reasonable misunderstanding here (including important disclaimers that outline HLI’s positives).
Despite claims that HLI does not advocate for any particular philosophical view, I think this is easily (and reasonably) misinterpreted.
James’ comment thread below: “Our focus on subjective wellbeing (SWB) was initially treated with a (understandable!) dose of scepticism. Since then, all the major actors in effective altruism’s global health and wellbeing space seem to have come around to it”
See alex’s comment below, where TLYCS is quoted to say: “we will continue to rely heavily on the research done by other terrific organizations in this space, such as GiveWell, Founders Pledge, Giving Green, Happier Lives Institute [...]”
I think excluding “to identify candidates for our recommendations, even as we also assess them using our own evaluation framework” [emphasis added] gives a fairly different impression to the actual quote, in terms of whether or not TLYCS supports WELLBYs as an approach.
While I wouldn’t want to exclude careless communication / miscommunication, I can understand why others might feel less optimistic about this, especially if they have engaged more deeply at the object level and found additional reasons to be skeptical.[2] I do feel like I subjectively have a lower bar for investigating strong claims by HLI than I did 7 or 8 months ago.
(commenting in personal capacity etc)
============
Adding a note RE: Nathan’s comment below about bad blood:
Just for the record, I don’t consider there to be any bad blood between me and any members of HLI. I previously flagged a comment I wrote with two HLI staff, worrying that it might be misinterpreted as uncharitable or unfair. Based on positive responses there and from other private discussions, my impression is that this is mutual.[3]
- ^
-This was the claim that originally prompted me to look more deeply into the StrongMinds studies. After <30 minutes on StrongMinds’ website, I stumbled across a few things that stood out as surprising, which prompted me to look deeper. I summarise some thoughts here (which has been edited to include a compilation of most of the critical relevant EA forum commentary I have come across on StrongMinds), and include more detail here.
-I remained fairly cautious about claims I made, because this entire process took three years / 10,000 hours, so I assumed by default I was missing information or that there was a reasonable explanation.
-However, after some discussions on the forum / in private DMs with HLI staff, I found it difficult to update meaningfully towards believing this statement was a sufficiently well-justified one. I think a fairly charitable interpretation would be something like “this claim was too strong, it is attributable to careless communication, but unintentional.”
- ^
Quotes above do not imply any particular views of the commenters referenced.
- ^
I have not done this for this message, as I view it as largely a compilation of existing messages that may help provide more context.
A commonly used model in the trust literature (Mayer et al., 1995) is that trustworthiness can be broken down into three factors: ability, benevolence, and integrity.
RE: domain specific, the paper incorporates this under ‘ability’:
The domain of the ability is specific because the trustee may be highly competent in some technical area, affording that person trust on tasks related to that area. However, the trustee may have little aptitude, training, or experience in another area, for instance, in interpersonal communication. Although such an individual may be trusted to do analytic tasks related to his or her technical area, the individual may not be trusted to initiate contact with an important customer. Thus, trust is domain specific.
There are other conceptions but many of them describe something closer to trust that is domain specific rather than generalised.
...All of these are similar to ability in the current conceptualization. Whereas such terms as expertise and competence connote a set of skills applicable to a single, fixed domain (e.g., Gabarro’s interpersonal competence), ability highlights the task- and situation-specific nature of the construct in the current model.
This is a conversation I have a fair amount when I talk to non-EA + non-medical friends about work, some quick thoughts:
If someone asks me Qs around DALYs at all (i.e. “why measure”), I would point to general cases where this happens fairly uncontroversially, e.g.:
-If you were in charge of the health system, how would you choose to distribute the resources you get?
-If you were building a hospital, how would you go about choosing how to allocate your wards to different specialties?
-If you were in an emergency waiting room and you had 10 people in the waiting room, how would you choose who to see first?
These kinds of questions entail some kind of “diverting resources from one person to another” in a way that is pretty understandable (though they also point to reasonable considerations for why you might not only use DALYs in those contexts)
If someone is challenging me over using DALYs in the context of it being a measurement system that is potentially ableist, then I generally just agree—it is indeed ableist by some framings![1]
Though often in these conversations the underlying theme isn’t necessarily “I have a problem with healthcare prioritisation” but rather a general sense that disabled folk aren’t receiving enough resources for their needs. So when having these conversations it’s important to acknowledge that disabled folk do just face a lot more challenges navigating the healthcare system (and society generally) through no fault of their own, and that we haven’t worked out how to prioritise accordingly or how to remove the barriers that disabled folk face.
If the claim goes further and is explicitly saying that interventions for disabilities are more cost-effective than the current DALY approach gives them credit for, then that’s also worth considering, though the standard of evidence would correspondingly increase if they are suggesting a new approach to resource allocation. As Larks’ comment illustrates, it is difficult to find a single approach / measure that doesn’t push against intuitions or have something problematic at the policy level.[2]
On how you’re feeling when talking about prioritising:
But then I feel like I’m implicitly saying something about valuing some people’s lives less than others, or saying that I would ultimately choose to divert resources from one person’s suffering to another’s.
This makes sense, though I do think there is a decent difference between the claim of “some people’s lives are worth more than others” and the claim of “some healthcare resources go further in one context than others (and thus justify the diversion)”. For example, I think if you never actively deprioritised anyone you would end up implicitly/passively prioritising based on things like [who can afford to go to the hospital / who lives closer / other access constraints]. But these are going to be much less correlated to what people care about when they say “all lives are equal”.
But if we have data on what the status quo is, then “not prioritising” / “letting the status quo happen” is still a choice we are making! And so we try to improve on the status quo and save more lives, precisely because we don’t think the 1000 patients on diabetes medication are worth less than the one cancer patient on a third-line immunotherapy.
- ^
E.g., for DALYs, the disability weight of 1 person with (condition A+B) is mathematically forced to be lower than the combined disability weight of two separate individuals with condition A and condition B respectively. That means for any cure of condition A, those who have only condition A would theoretically be prioritised under the DALY framework over those who have other health issues (e.g. have a disability). While I don’t have a good sense of when/if this specific part of the DALY framework has impacted resource allocation in practice, it is important to acknowledge the (many!) limitations of the measures we use.
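A minimal numerical sketch of this point, assuming the multiplicative combination of disability weights used in recent GBD studies (the weights below are made up purely for illustration):

```python
# Hypothetical disability weights for two conditions (illustrative only, not real GBD values).
dw_a, dw_b = 0.3, 0.4

# Multiplicative combination for comorbidity: DW(A+B) = 1 - (1 - DW_A) * (1 - DW_B)
dw_ab = 1 - (1 - dw_a) * (1 - dw_b)  # 0.58, forced to be lower than 0.3 + 0.4 = 0.70

# Annual benefit (averted disability weight) of curing condition A:
benefit_if_only_a = dw_a             # 0.30 for someone with condition A alone
benefit_if_comorbid = dw_ab - dw_b   # ~0.18 for someone who also has condition B
```

So under this combination rule, the same cure averts fewer DALYs for the person with the comorbidity, which is the prioritisation effect described above.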
- ^
Also, different folks within the disability community have a wide range of views around what it means to live with a disability / be a disabled person (e.g. functional VS social models of disability), so it’s not actually clear that e.g., WELLBYs would necessarily lead to more healthcare resources in that direction, depending on which groups you were talking to.
Historical Global Health R&D “hits”: Development, main sources of funding, and impact
Thanks for writing this! RE: We would advise against working at Conjecture
We think there are many more impactful places to work, including non-profits such as Redwood, CAIS and FAR; alignment teams at Anthropic, OpenAI and DeepMind; or working with academics such as Stuart Russell, Sam Bowman, Jacob Steinhardt or David Krueger. Note we would not in general recommend working at capabilities-oriented teams at Anthropic, OpenAI, DeepMind or other AGI-focused companies.
Additionally, Conjecture seems relatively weak for skill building [...] We expect most ML engineering or research roles at prominent AI labs to offer better mentorship than Conjecture. Although we would hesitate to recommend taking a position at a capabilities-focused lab purely for skill building, we find it plausible that Conjecture could end up being net-negative, and so do not view Conjecture as a safer option in this regard than most competing firms.
I don’t work in AI safety and am not well-informed on the orgs here, but did want to comment on this as this recommendation might benefit from some clarity about who the target audience is.
As written, the claims sound something like:
CAIS et al., alignment teams at Anthropic et al., and working with Stuart Russell et al., are better places to work than Conjecture
Though not necessarily recommended, capabilities research at prominent AI labs is likely to be better than working at Conjecture for skill building, since Conjecture is not necessarily safer.
However:
The suggested alternatives don’t seem like they would be able to absorb a significant amount of additional talent, especially given the increase in interest in AI.
I have spoken to a few people working in AI / AI field building who perceive mentoring to be a bottleneck in AI safety at the moment.
If both of the above are true, what would your recommendation be to someone who had an offer from Conjecture, but not your recommended alternatives? E.g., choosing between independent research funded by LTFF VS working for Conjecture?
Just seeking a bit more clarity about whether this recommendation is mainly targeted at people who might have a choice between Conjecture and your alternatives, or whether this is a blanket recommendation that one should reject offers from Conjecture, regardless of seniority and what their alternatives are, or somewhere in between.
Thanks again!
Better weather forecasting: Agricultural and non-agricultural benefits in low- and lower-middle-income countries
bruce’s Quick takes
Some very quick thoughts on EY’s TIME piece, from the perspective of someone ~outside of AI safety work. I have no technical background and don’t follow the field closely, so I’m likely missing some context and nuance; happy to hear pushback!
Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
My immediate reaction when reading this was something like “wow, is this representative of AI safety folks? Are they willing to go to any lengths to stop AI development?”. I’ve heard anecdotes of people outside of all this saying the piece reads like it was written by a terrorist organisation, for example, which is stronger language than I’d use, but I think suggestions like this do unfortunately play into potential comparisons to ecofascists.
As someone seen publicly to be a thought leader and widely regarded as a founder of the field, there are some risks to this kind of messaging. It’s hard to evaluate how this trades off, but I definitely know communities and groups that would be pretty put off by this, and it’s unclear how much value the sentences around willingness to escalate nuclear war are actually adding.
It’s an empirical Q how to trade off between risks from nuclear war and risks from AI, but the claim that “preventing AI extinction is a priority above a nuclear exchange” is ~trivially true; the reverse is also true: “preventing extinction from nuclear war is a priority above preventing AI training runs”. Given the difficulty of illustrating and defending to the general public a position that the risks of AI training runs are substantially higher than those of a nuclear exchange, I would have erred on the side of caution when saying things that are as politically charged as advocating for nuclear escalation (or at least something that can be interpreted as such).
I wonder which superpower EY trusts to properly identify a hypothetical “rogue datacentre” that’s worthy of a military strike for the good of humanity, or whether this will just end up with parallels to other failed excursions abroad ‘for the greater good’ or to advance individual national interests.
If nuclear weapons are a reasonable comparison, we might expect limitations to end up with a few competing global powers having access to AI developments, and other countries that do not. It seems plausible that criticism around these treaties being used to maintain the status quo in the nuclear nonproliferation / disarmament debate may be applicable here too.
Unlike nuclear weapons (though nuclear power may weaken this somewhat), developments in AI have the potential to help immensely with development and economic growth.
Thus the conversation may eventually bump into something that looks like:
Richer countries / first movers that have obtained significant benefits of AI take steps to prevent other countries from catching up.[1]
Rich countries using the prevention of AI extinction as a guise to further national interests
Development opportunities from AI for LMICs are similarly hindered, or only allowed in a way that is approved by the first movers in AI.
Given the above, and that conversations around and tangential to AI risk already receive some pushback from the Global South community for distracting and taking resources away from existing commitments to UN Development Goals, my sense is that folks working in AI governance / policy would likely strongly benefit from scoping out how these developments are affecting Global South stakeholders, and how to get their buy-in for such measures.
(disclaimer: one thing this gestures at is something like—“global health / development efforts can be instrumentally useful towards achieving longtermist goals”[2], which is something I’m clearly interested in as someone working in global health. While it seems rather unlikely that doing so is the best way of achieving longtermist goals on the margin[3], it doesn’t exclude some aspect of this in being part of a necessary condition for important wins like an international treaty, if that’s what is currently being advocated for. It is also worth mentioning because I think this is likely to be a gap / weakness in existing EA approaches).
In our new report, The Elephant in the Bednet, we show that the relative value of life-extending and life-improving interventions depends very heavily on the philosophical assumptions you make. This issue is usually glossed over and there is no simple answer.
We conclude that the Against Malaria Foundation is less cost-effective than StrongMinds under almost all assumptions. We expect this conclusion will similarly apply to the other life-extending charities recommended by GiveWell.
In suggesting James quote these together, it sounds like you’re saying something like “this is a clear caveat to the strength of recommendation behind StrongMinds, HLI doesn’t recommend StrongMinds as strongly as the individual bullet implies, it’s misleading for you to not include this”.
But in other places HLI’s communication around this takes on a framing of something closer to “The cost effectiveness of AMF, (but not StrongMinds) varies greatly under these assumptions. But the vast majority of this large range falls below the cost effectiveness of StrongMinds”. (extracted quotes in footnote)[1]
As a result of this framing, despite the caveat that HLI “[does] not advocate for any particular view”, I think it’s reasonable to interpret this as being strongly supportive of StrongMinds, which can be true even if HLI does not have a formed view on the exact philosophical view to take.[2]
If you did mean the former (that the bullet about philosophical assumptions is primarily included as a caveat to the strength of recommendation behind StrongMinds), then there is probably some tension here between (emphasis added):
-”the relative value of life-extending and life-improving interventions depends very heavily on the philosophical assumptions you make...there is no simple answer”, and
-”We conclude StrongMinds > AMF under almost all assumptions”
Additionally, I think some weak evidence that HLI is not as well-caveated as it could be is that many people (mistakenly) viewed HLI as an advocacy organisation for mental health interventions. I do think this is a reasonable outside interpretation based on HLI’s communications, even though this is not HLI’s stated intent. For example, I don’t think it would be unreasonable for an outsider to read your current pinned thread and come away with conclusions like:
“StrongMinds is the best place to donate”,
“StrongMinds is better than AMF”,
“Mental health is a very good place to donate if you want to do the most good”,
“Happiness is what ultimately matters for wellbeing and what should be measured”.
If these are not what you want people to take away, then I think pointing to this bullet point caveat doesn’t really meaningfully address this concern—the response kind of feels something like “you should have read the fine print”. While I don’t think it’s necessary for HLI to take a stance on specific philosophical views, I do think it becomes an issue if people are (mis)interpreting HLI’s stance based on its published statements.
(commenting in personal capacity etc)
- ^
-We show how much cost-effectiveness changes by shifting from one extreme of (reasonable) opinion to the other. At one end, AMF is 1.3x better than StrongMinds. At the other, StrongMinds is 12x better than AMF.
-StrongMinds and GiveDirectly are represented with flat, dashed lines because their cost-effectiveness does not change under the different assumptions.
-As you can see, AMF’s cost-effectiveness changes a lot. It is only more cost-effective than StrongMinds if you adopt deprivationism and place the neutral point below 1.
- ^
As you’ve acknowledged, comments like “We’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money.” perhaps add to the confusion.
That makes sense, thanks for clarifying!
If I understand correctly, the updated figures should then be:
For 1 person being treated by StrongMinds (excluding all household spillover effects) to be worth the WELLBYs gained for a year of life[1] with HLI’s methodology, the neutral point needs to be at least 4.95-3.77 = 1.18.
If we include spillover effects of StrongMinds (and use the updated / lower figures), then the benefit of 1 person going through StrongMinds is 10.7 WELLBYs.[2] Under HLI’s estimates, this is equivalent to more than two years of wellbeing benefits from the average life, even if we set the neutral point at zero. Using your personal neutral point of 2 would suggest the intervention for 1 person including spillovers is equivalent to >3.5 years of wellbeing benefits. Is this correct or am I missing something here?
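To make the arithmetic explicit, here is a quick sketch using only the figures quoted above (4.95 average 0–10 life satisfaction, 3.77 WELLBYs excluding spillovers, 10.7 WELLBYs including spillovers, and neutral points of 0 and 2):

```python
avg_life_satisfaction = 4.95    # HLI's average life satisfaction figure (adults in 6 African countries)
benefit_individual = 3.77       # WELLBYs per person treated, excluding household spillovers
benefit_with_spillovers = 10.7  # WELLBYs per person treated, including household spillovers

# Neutral point at which treating one person ~equals one year of average life:
breakeven_neutral_point = avg_life_satisfaction - benefit_individual       # 1.18

# Years-of-life equivalents for the benefit including spillovers:
years_if_neutral_0 = benefit_with_spillovers / (avg_life_satisfaction - 0) # ~2.2 years
years_if_neutral_2 = benefit_with_spillovers / (avg_life_satisfaction - 2) # ~3.6 years
```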
1.18 as the neutral point seems pretty reasonable, though the idea that 12 hours of therapy for an individual is worth the wellbeing benefits of 1 year of an average life when only considering impacts to them, and anywhere between 2~3.5 years of life when including spillovers does seem rather unintuitive to me, despite my view that we should probably do more work on subjective wellbeing measures on the margin. I’m not sure if this means:
WELLBYs as a measure can’t capture what I care about in a year of healthy life, so we should not use solely WELLBYs when measuring wellbeing;
HLI isn’t applying WELLBYs in a way that captures the benefits of a healthy life;
The existing way of estimating 1 year of life via WELLBYs is wrong in some other way (e.g. the 4.95 assumption is wrong, the 0-10 scale is wrong, the ~1.18 neutral point is wrong);
HLI have overestimated the benefits of StrongMinds;
I have a very poorly calibrated view of how much 12 hours of therapy / a year of life is worth, though this seems less likely.
Would be interested in your thoughts on this / let me know if I’ve misinterpreted anything!
- ^
More precisely, the average wellbeing benefits from 1 year of life from an adult in 6 African countries
Thanks Joel.
this comparison, as it stands, doesn’t immediately strike me as absurd. Grief has an odd counterfactual. We can only extend lives. People who’re saved will still die and the people who love them will still grieve. The question is how much worse the total grief is for a very young child (the typical beneficiary of e.g., AMF) than the grief for the adolescent, or a young adult, or an adult, or elder they’d become
My intuition, which is shared by many, is that the badness of a child’s death is not merely due to the grief of those around them. So presumably the question should not be comparing just the counterfactual grief of losing a very young child VS an [older adult], but also “lost wellbeing” from living a net-positive-wellbeing life in expectation?
I also just saw that Alex claims HLI “estimates that StrongMinds causes a gain of 13 WELLBYs”. Is this for 1 person going through StrongMinds (i.e. ~12 hours of group therapy), or something else? Where does the 13 WELLBYs come from?
I ask because if we are using HLI’s estimates of WELLBYs per death averted, and use your preferred estimate for the neutral point, then 13 / (4.95-2) is >4 years of life. Even if we put the neutral point at zero, this suggests 13 WELLBYs is worth >2.5 years of life.[1]
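Spelling out that calculation (using the 13 WELLBYs figure and the same 4.95 average life satisfaction quoted earlier):

```python
wellbys_gained = 13
avg_life_satisfaction = 4.95

wellbys_gained / (avg_life_satisfaction - 2)  # ~4.4 years of life, with a neutral point of 2
wellbys_gained / (avg_life_satisfaction - 0)  # ~2.6 years of life, with a neutral point of 0
```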
I think I’m misunderstanding something here, because GiveWell claims “HLI’s estimates imply that receiving IPT-G is roughly 40% as valuable as an additional year of life per year of benefit or 80% of the value of an additional year of life total.”
Can you help me disambiguate this? Apologies for the confusion.
- ^
13 / 4.95
- ^
To be a little more precise:
HLI’s estimates imply, for example, that a donor would pick offering StrongMinds’ intervention to 20 individuals over averting the death of a child, and that receiving StrongMinds’ program is 80% as good for the recipient as an additional year of healthy life.
I.e., is it your view that 4-8 weeks of group therapy (~12 hours) for 20 people is preferable to averting the death of a child?
it seems low cost and potentially quite valuable to put up a title and perhaps just a one-para abstract of all the projects you have done/are doing
This is a great suggestion, thanks!
Thanks for this! Yeah, the research going out of date is definitely a relevant concern in some faster-moving areas. RE: easiest to put it up ~immediately: I think if our reports for clients could just be copy-pasted into a public-facing version for a general audience this would be true, but in practice this is often not the case, e.g. because the client has some underlying background knowledge that it would be unreasonable to expect the public to have, or because we need to run quotes by interviewees to check they’re happy to be quoted publicly, etc.
There’s a direct tradeoff here between spending time turning a client-facing report into a public-facing version and just starting the next client-facing report. In most cases we’ve just prioritised the next client-facing report, but it is definitely something we want to think more about going forward, and I think our most recent round of hires has definitely helped with this.
In an ideal world the global health team would just have a lot of unrestricted funding to use so we could push these things out in parallel, in part because it is one way (among many others we’d like to explore) of increasing the impact of research we’ve already done, and also because this would provide extra feedback loops that can improve our own process + work.
Thanks for engaging! I’ll speak for myself here, though others might chime in or have different thoughts.
How do you determine if you’re asking the right questions?
Generally we ask our clients at the start something along the lines of “what question is this report trying to help answer for you?”. Often this is fairly straightforward, like “is this worth funding”, or “is this worth more researcher hours in exploring”. And we will often push back or add things to the brief to make sure we include what is most decision-relevant within the timeframe we are allocated. An example of this is when we were asked to look into the landscape of philanthropic spending for cause area X, but it turned out that the non-philanthropic spending might also be pretty decision-relevant, so we suggested incorporating that into the report.
We have multiple check-ins with our client to make sure the information we get is the kind of information they want, and to have opportunities to pivot if new questions come up as a result of what we find that might be more decision-relevant.
What is your process for judging information quality?
I don’t think we have a formalised organisational-level process around this; I think this is just fairly general research appraisal stuff that we do independently. There’s a tradeoff between following a thorough process and speed; it might be clear on skimming that a given study should update us much less because of its recruitment or allocation methods etc., but if we needed to e.g. run the MMAT on every study we read this would be pretty time-consuming. In general we try to transparently communicate what we’ve done in check-ins with each other, with our client, and in our reports, so they’re aware of limitations in the search and our conclusions.
Do you employ any audits or tools to identify/correct biases (e.g. what studies you select, whom you decide to interview, etc.)?
Can you give me an example of a tool to identify biases in the above? I assume you aren’t referring to tools that we can use to appraise individual studies/reviews but one level above that?
RE: interviews, one approach we frequently take is to look for key papers or reports in the field that are most likely to be decision-relevant and reach out to their authors. Sometimes we will intentionally aim to find views that push us in opposing directions on the potential decision. Other times we just need technical expertise in an area that our team doesn’t have. Generally we will reach out to the client with the list to make sure they’re happy with the choices we’ve made, which is intended to reduce doubling up on the same expert, but also serves as a checkpoint I guess.
We don’t have audits but we do have internal reviews, though admittedly I think our current process is unlikely to pick up issues around interviewee selection unless the reviewer is well connected in this space, and it will similarly likely only pick up issues in study selection if the reviewer knows specific papers or has some strong priors around the existence of stronger evidence on this topic. My guess is that the likelihood of audits making meaningful changes to our reports is sufficiently low that if they take more than a few days they just wouldn’t be worth the time for most of the reports we are doing. That being said, it might be a reasonable thing to consider as part of a separate retrospective review of previous reports! Do you have any suggestions here, or are there good approaches you know about / have seen?
Our research process: an overview from Rethink Priorities’ Global Health and Development team
My concern is that if we sexually neuter all EA groups, meetings, and interactions, and sever the deep human motivational links between our mating effort and our intellectual and moral work, we’ll be taking the wind out of EA’s sails. We’ll end up as lonely, dispirited incels rowing our little boats around in circles, afraid to reach out, afraid to fall in love.
These are some pretty strong claims that don’t seem particularly well substantiated.
Is trying to be romantically attractive the “wrong reason” for doing excellent intellectual work, displaying genuine moral virtues, and being friendly at meetings?
I also feel a bit confused about this. I think if someone is taking a particular action, or “investing in difficult, challenging behaviors to attract mates”, it does seem clear there are contexts where the added intention of “to attract mates” changes how the interaction feels to me, and contexts where that added intention makes the interaction feel inappropriate. For example, if I’m at work and I think someone is friendly at the meeting because they primarily want to attract a mate vs if they are following professional norms vs if they’re a kind person who cares about fostering a welcoming space for discussion, I do consider some reasons better than others.
While I don’t think it’s wrong to try to attract mates at a general level, I think this can happen in ways that are deceitful, and ways that leverage power dynamics in a way that’s unfair and unpleasant (or worse) for the receiving party. In a similar vein, I particularly appreciated Dustin’s tweet here.
I do think International Women’s Day is a timely prompt for EA folks to celebrate and acknowledge the women in EA who are drawn to EA because they want to help find the best ways to help others, or to put them into practice. I appreciate (and am happy for you & Diana!) that there will be folks who benefit from finding like-minded mates in EA. I also agree that often there are overt actions that come with obvious social costs, and “going too far” in the other direction seems bad by definition. But I also want to recognise that sometimes there are likely actions that are not “overtly” costly, or may even be beneficial for those who are primarily motivated to attract mates, but may be costly in expectation for those who are primarily interested in EA as a professional space, or as a place where they can collaborate with people who also care about tackling some of the most important issues we face today. And I think this is a tradeoff that’s important to consider—ultimately the EA I want to see and be part of is one that optimises for doing good, and while that’s not mutually exclusive to trying to attract mates within EA, I’d be surprised if doing so as the primary goal also happened to be the best approach for doing good.
Both Kat and Emerson are claiming that there have been edits to this post.[1]
I wonder whether an appendix or summary of changes to important claims would be fair and appropriate, given the length of post and severity of allegations? It’d help readers keep up with these changes, and it is likely most time-efficient for the author making the edits to document these as they go along.
@Ben Pace
[Edit: Kat has since retracted her statement.]