Doctor from NZ, independent researcher (grand futures / macrostrategy) collaborating with FHI / Anders Sandberg. Previously: Global Health & Development research @ Rethink Priorities.
Feel free to reach out if you think there’s anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you’re a medical student / junior doctor reconsidering your clinical future, or if you’re quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.
Outside of EA, I do a bit of end of life care research and climate change advocacy, and outside of work I enjoy some casual basketball, board games and good indie films. (Very) washed up classical violinist and Oly-lifter.
All comments in personal capacity unless otherwise stated.
bruce
It sounds like you’re interpreting my claim to be “the Baird RCT is a particularly good proxy (or possibly even better than other RCTs on group therapy in adult women) for the SM adult programme effectiveness”, but this isn’t actually my claim here; and while I think one could reasonably make some different, stronger (donor-relevant) claims based on the discussions on the forum and the Baird RCT results, mine are largely just: “it’s an important proxy”, “it’s worth updating on”, and “the relevant considerations/updates should be easily accessible on various recommendation pages”. I definitely agree that an RCT on the adult programme would have been better for understanding the adult programme.
(I’ll probably check out of the thread here for now, but good chatting as always Nick! hope you’re well)
Yes, because:
1) I think this RCT is an important proxy for StrongMinds (SM)’s performance ‘in situ’, and worth updating on, in part because it is currently the only completed RCT of SM. Uninformed readers who read what is currently on e.g. the GWWC[1]/FP[2]/HLI websites might reasonably get the wrong impression of the evidence base behind the recommendation around SM (i.e. that there are no concerns sufficiently noteworthy to merit inclusion as a caveat). I think the effective giving community should have a higher bar for being proactively transparent here: it is much better to include (at minimum) a relevant disclaimer like this, than to be asked questions by donors and make a claim that there wasn’t capacity to include one.[3]
2) If a SM recommendation is justified as a result of SM’s programme changes, this should still be communicated for trust building purposes (e.g. “We are recommending SM despite [Baird et al RCT results], because …”), both for those who are on the fence about deferring, and for those who now have a reason to re-affirm their existing trust in EA org recommendations.[4]
3) Help potential donors make more informed decisions: for example, informed readers who may be unsure about HLI’s methodology and wanted to wait for the RCT results should not have to go search this up themselves or look for a fairly buried comment thread on a post from >1 year ago in order to make this decision when looking at EA recommendations / links to donate. I don’t think it’s an unreasonable amount of effort compared to how it may help. This line of reasoning may also apply to other evaluators (e.g. GWWC evaluator investigations).[5]
- ^
GWWC website currently says it only includes recommendations after they review it through their Evaluating Evaluators work, and their evaluation of HLI did not include any quality checks of HLI’s work itself nor finalise a conclusion. Similarly, they say: “we don’t currently include StrongMinds as one of our recommended programs but you can still donate to it via our donation platform”.
- ^
Founders Pledge’s current website says:
We recommend StrongMinds because IPT-G has shown significant promise as an evidence-backed intervention that can durably reduce depression symptoms. Crucial to our analysis are previous RCTs
- ^
I’m not suggesting at all that they should have done this by now, only ~2 weeks after the Baird RCT results were made public. But I do think three months is a reasonable timeframe for this.
- ^
If there was an RCT that showed malaria chemoprevention cost more than $6000 per DALY averted in Nigeria (GDP/capita * 3), rather than per life saved (ballpark), I would want to know about it. And I would want to know about it even if Malaria Consortium decided to drop their work in Nigeria, and EA evaluators continued to recommend Malaria Consortium as a result. And how organisations go about communicating updates like this does impact my personal view on how much I should defer to them wrt charity recommendations.
- ^
Of course, based on HLI’s current analysis/approach, the ?disappointing/?unsurprising result of this RCT (even if it was on the adult population) would not have meaningfully changed the outcome of the recommendation, even if SM did not make this pivot (pg 66):
Therefore, even if the StrongMinds-specific evidence finds a small total recipient effect (as we present here as a placeholder), and we relied solely on this evidence, then it would still result in a cost-effectiveness that is similar or greater than that of GiveDirectly because StrongMinds programme is very cheap to deliver.
And while I think this is a conversation that has already been hashed out enough on the forum, I do think the point stands—potential donors who disagree with or are uncertain about HLI’s methodology here would benefit from knowing the results of the RCT, and it’s not an unreasonable ask for organisations doing charity evaluations / recommendations to include this information.
- ^
Based on Nigeria’s GDP/capita * 3
- ^
Acknowledging that this is DALYs not WELLBYs! OTOH, this conclusion is not based on the GiveWell or GiveDirectly bar, but on a ~mainstream global health cost-effectiveness standard of ~3x GDP per capita per DALY averted (in this case, the ~$18k USD PPP/DALY averted of SM does not meet the ~$7k USD PPP/DALY bar for Uganda, i.e. it costs more per DALY averted than the threshold allows)
- ^
My view is that HLI[1], GWWC[2], Founders Pledge[3], and other EA / effective giving orgs that recommend or provide StrongMinds as a donation option should ideally at least update their page on StrongMinds to include relevant considerations from this RCT, and do so well before Thanksgiving / Giving Tuesday in Nov/Dec this year, so donors looking to decide where to spend their dollars most cost effectively can make an informed choice.[4]
- ^
Listed as a top recommendation
- ^
Not currently a recommendation (but included as an option to donate)
- ^
Currently tagged as an “active recommendation”
- ^
Acknowledging that HLI’s current schedule is “By Dec 2024”, though this may only give donors 3 days before Giving Tuesday.
- ^
Congratulations on the pilot!
I just thought I’d flag some initial skepticism around the claim:
Our estimates indicate that next year, we will become 20 times as cost-effective as cash transfers.
Overall I expect it may be difficult for the uninformed reader to know how much they should update based on this post (if at all), but given you have acknowledged many of these (fairly glaring) design/study limitations in the text itself, I am somewhat surprised the team is still willing to make the extrapolation from 7x to 20x GD within a year. It also requires that the team is successful with increasing effective outreach by 2 OOMs despite currently having less than 6 months of runway for the organisation.[1]
I also think this pilot should not give the team “a reasonable level of confidence that [the] adaptation of Step-by-Step was effective” insofar as the claim is that charitable dollars here are cost competitive with top GiveWell charities (or that there is good reason to believe you will be 2x top GiveWell charities next year), though perhaps you just meant from an implementation perspective, not cost-effectiveness. My current view is that while this might be a reasonable place to consider funding for non-EA funders (or e.g. those specifically interested in mental health or mental health in India), I’d hope that the EA community who are looking to maximise impact through their donations in the GHD space would update based on higher evidentiary standards than what has been provided in this post, which IMO indicates little beyond feasibility and acceptability (which is still promising and exciting news, and I don’t want to diminish this!).
I don’t want this to come across as a rebuke of the work the team is trying to do. I am on the record for being excited about more people doing work that uses subjective wellbeing on the margin, and I think this is work worth doing. But I hope the team is mindful that continued overconfident claims in this space may cause people to negatively update and become less likely to fund this work in future because of totally preventable communication-related decisions, and not because wellbeing approaches are bad/not worth funding in principle.
- ^
A very crude BOTEC based only on the increased time needed for the 15min / week calls with 10,000 people indicates something like 17 additional guides doing the 15min calls full time, assuming they do nothing but these calls every day. The increase in human resources needed to scale up to reaching 10,000 people is of course much more intensive than this, even for a heavily WhatsApp based intervention.
10000 * 0.25 * 6 * 0.27 / 40 / 6 = 16.875
(number reached * call hours per person per week * weeks * retention / guide working hours per week / weeks)
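The BOTEC above can be written out as a short Python sketch. The function and parameter names are hypothetical labels for the terms in the formula; the inputs (10,000 people, 15-minute weekly calls, 6 weeks, 27% retention, a 40-hour working week) are simply the figures used in the calculation above, not independently sourced values.

```python
def additional_guides(people_reached, call_hours_per_week, weeks,
                      retention, guide_hours_per_week):
    """Full-time guides needed if guides do nothing but the weekly calls."""
    # Total call hours delivered over the programme, scaled by retention.
    total_call_hours = people_reached * call_hours_per_week * weeks * retention
    # Hours one full-time guide can deliver over the same number of weeks.
    hours_per_guide = guide_hours_per_week * weeks
    return total_call_hours / hours_per_guide


guides = additional_guides(
    people_reached=10_000,
    call_hours_per_week=0.25,   # one 15-minute call per week
    weeks=6,
    retention=0.27,
    guide_hours_per_week=40,    # assumed full-time working week
)
print(round(guides, 3))
```

Because the number of weeks appears in both the numerator and the denominator, the result does not depend on programme length under these assumptions; it is driven by the number reached, retention, and call length.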
- ^
Hey Ben! A few quick Qs:
Did the team consider a paid/minimum wage position instead of an unpaid one? How did it decide on the unpaid positions?
Is the theory of change for impact here mainly an “upskill students/early career researchers” thing, or for the benefits to RP’s research outputs?
What is RP’s current policy on volunteers?
Does RP expect to continue recruiting volunteers for research projects in the future?
I think it is entirely possible that people are being unkind because they updated too quickly on claims from Ben’s post that are now being disputed, and I’m grateful that you’ve written this (ditto chinscratch’s comment) as a reminder to be empathetic. That being said, there are also some reasons people might be less charitable than you are that are unrelated to them being unkind, or to the facts that are in contention:
I have only heard good things about Nonlinear, outside these accusations
Right now, on the basis of what could turn out to have been a lot of lies, their reputations, friendship futures and careers are at risk of being badly damaged
Without commenting on whether Ben’s original post should have been approached better or worded differently or was misleading etc, this comment from the Community Health/Special Projects team might add some useful additional context. There are also previous allegations that have been raised.[1]
Perhaps you are including both of these as part of the same set of allegations, but some may suggest that not being permitted to run sessions / recruit at EAGs and considering blocking attendance (especially given the reference class of actions that have prompted various responses that you can see here) is qualitatively important and may affect whether commentors are being charitable or not (as opposed to if they just considered the contents of Ben’s post VS Nonlinear (NL)’s response). Of course, this depends on how much you think the Community Health/Special Projects team are trustworthy with their judgement / investigation, or how likely this is all just an information cascade etc.
It seems reasonable to assume that the people at Nonlinear are altruistic people.
It is possible for altruistic people to be poor managers, poor leaders, make bad decisions about professional boundaries, have a poor understanding of power dynamics, or indeed, be abusive. The extent to which people at NL are altruistic is (afaict) not a major point of contention, and it is possible to not update about how altruistic someone is while also wanting to hold them accountable to some reasonable standard like “not being abusive or manipulative towards people you manage”.
Instead, as I see it, the main, or at least most upvoted, response here has been to critique stylistic mistakes made in their almost impossible task of refuting very damaging claims from anonymous sources in unknown contexts.
The claims in question from Alice/Chloe/Ben are not anonymous, the identities of Alice and Chloe are known to the Nonlinear team.
Independent of my personal views on these issues, I do think the pushback around ‘stylistic mistakes’ is reasonable insofar as people interpret this to be indicative of something concerning about NL’s approach towards managing staff / criticism / conflict (1, 2, 3), rather than e.g. just being nitpicky about tone, though I appreciate both interpretations are plausible.
I’d like people to imagine what they would do in a similar situation if they were faced with similar accusations. How would you successfully persuade people that you didn’t do the things you were accused of, and that the context was not as portrayed?
I think (much) less is more in this case.[2] I think there are parts of this current post that feel more subjective and not supported by facts, and may be reasonably interpreted by a cynical outsider to look like a distraction or a defensive smear campaign. I think these choices are counterproductive (both for a truth-seeking outsider, and for NL’s own interests), especially given the allegations of frame control and being retaliatory.
There are other parts that might similarly be reasonably interpreted to range from irrelevant (Alice’s personal drug use habits), unproductive (links to Kathy Forth), or misleading (inclusion of photos, inconsistent usage of quotation marks, unnecessary paraphrasing, usage of quotes that miss the full context). I disagreed with the approaches here, though I acknowledge there were competing opinions and I wasn’t privy to the internal discussions that led to the decisions.
I think a cleaner version of this would have probably been something 5 to 10x shorter (not including the appendix), and looked something like:[3]
Apology for harms done
Acknowledgement of which allegations are seen as the most major (much closer to top 3-5 than all 85)
Responses to major allegations, focusing only on factual differences and claims that are backed up by ~irrefutable evidence
Charitable interpretations of Alice/Chloe/Ben’s position, despite above factual disagreement (what kinds of things need to be true for their allegations to be plausibly reasonable or fair from their perspective),
Lessons learnt, and things NL will do differently in future (some expression of self-awareness / reflection)
An appendix containing a list of unresolved but less critical allegations
Disclaimer: I offered to (and did) help review an early draft, in large part because I expected the NL team to (understandably!) be in panic mode after Ben’s post/getting dogpiled, and I wanted further community updates to be based on as much relevant information as was possible.
- ^
This footnote added in response to Jeff’s comment: I agree that it’s likely not double counting, because the story there appears to be one where Kat left the working relationship, which is inconsistent with the accounts of Alice / Chloe’s situations, but also makes it unlikely that the “current employee of NL / Kat” hypothesis is correct.
- ^
Perhaps hypocritical given the length of this comment
- ^
Acknowledging that I have no PR expertise
Can you assure me that Rethink’s researchers are independent?
I no longer work at RP, but I thought I’d add a data point from someone who doesn’t stand to benefit from your donations, in case it was helpful.
I think my take here is that if my experience doing research with the GHD team is representative of RP’s work going forwards, then research independence should not be a reason not to donate.[1]
My personal impression is that of the work that I / the GHD team has been involved with, I have been afforded the freedom to look for our best guess of what the true answers are, and have personally never felt constrained or pushed into a particular answer that wasn’t directly related to interpretation of the research. I have also consistently felt free to push back on lines of research that I feel would be less productive, or suggest stronger alternatives. I think credit here probably goes both to clients as well as the GHD team, though I’m not sure exactly how to attribute this.
I feel less confident about biases that may arise from the research agenda / selection of research questions or worldviews and assumptions of clients, but this could (for example) make one more inclined towards funding RP to do their own independent research, or specifying research you think is particularly important and neglected.
Edit: See thread by Saulius detailing his views.
- ^
Caveats: I can’t speak for the teams outside of GHD, and I can’t speak for RP’s work in 2024. This comment should not be seen as an endorsement of the claim that RP is the best place to donate to all things considered, which obviously is influenced by other variables beyond research independence.
- ^
Evidentiary standards. We drew on a large number of RCTs for our systematic reviews and meta-analyses of cash transfers and psychotherapy (42 and 74, respectively). If one holds that the evidence for something as well-studied as psychotherapy is too weak to justify any recommendations, charity evaluators could recommend very little.
A comparatively minor point, but it doesn’t seem to me that the claims in Greg’s post [more] are meaningfully weakened by whether or not psychotherapy is well-studied (as measured by how many RCTs HLI has found on it, noting that you already push back on some object level disagreement on study quality in point 1, which feels more directly relevant).
It also seems pretty unlikely to be true that psychotherapy being well studied necessarily means that StrongMinds is a cost-effective intervention comparable to current OP / GW funding bars (which is one main point of contention), or that charity evaluators need 74+ RCTs in an area before recommending a charity. Is the implicit claim being made here that the evidence for StrongMinds being a top charity is stronger than that of AMF, which is (AFAIK) based on fewer than 74 RCTs?[1]
I never worked directly with Meghan when we were colleagues, but my interactions with her were v positive and give me the impression that she would be a great supervisor to work with—infectiously passionate about her research, an excellent communicator, and kind + supportive.
This sounds like a terribly traumatic experience. I’m so sorry you went through this, and I hope you are in a better place and feel safer now.
Your self-worth is so, so much more than how well you can navigate what sounds like a manipulative, controlling, and abusive work environment.
spent months trying to figure out how to empathize with Kat and Emerson, how they’re able to do what they’ve done, to Alice, to others they claimed to care a lot about. How they can give so much love and support with one hand and say things that even if I’d try to model “what’s the worst possible thing someone could say”, I’d be surprised how far off my predictions would be.
It sounds like despite all of this, you’ve tried to be charitable to people who have treated you unfairly and poorly—while this speaks to your compassion, I know this line of thought can often lead to things that feel like you are gaslighting yourself, and I hope this isn’t something that has caused you too much distress.
I also hope that Effective Altruism as a community becomes a safer space for people who join it aspiring to do good, and I’m grateful for your courage in sharing your experiences, despite it (very reasonably!) feeling painful and unsafe for you.[1] All the best for whatever is next, and I hope you have access to enough support around you to help with recovering what you’ve lost.
============
[Meta: I’m aware that there will likely be claims around the accuracy of these stories, but I think it’s important to acknowledge the potential difficulty of sharing experiences of this nature with a community that rates itself highly on truth-seeking, possibly acknowledging your own lived experience as “stories” accordingly; as well as the potential anguish it might be for these experiences to have been re-lived over the past year and possibly again in the near future, if/when these claims are dissected, questioned, and contested.]
- ^
That being said, your experience would be no less valid had you chosen not to share these. And even though I’m cautiously optimistic that the EA community will benefit from you sharing these experiences, your work here is supererogatory, and improving Nonlinear’s practices or the EA community’s safety is not your burden to bear alone. In a different world it would have been totally reasonable for you to not have shared this, if that was what you needed to do for your own wellbeing. I guess this comment is more for past Chloes or other people with similar experiences who may have struggled with these kinds of decisions than it is for Chloe today, but thought it was worth mentioning.
- ^
Both Kat and Emerson are claiming that there have been edits to this post.[1]
I wonder whether an appendix or summary of changes to important claims would be fair and appropriate, given the length of post and severity of allegations? It’d help readers keep up with these changes, and it is likely most time-efficient for the author making the edits to document these as they go along.
[Edit: Kat has since retracted her statement.]
- ^
his original post (he’s quietly redacted a lot of points since publishing) had a lot of falsehoods that he knew were not true. He has since removed some of them after the fact, but those have still been causing us damage.
Ben has also been quietly fixing errors in the post, which I appreciate, but people are going around right now attacking us for things that Ben got wrong, because how would they know he quietly changed the post?
This is why every time newspapers get caught making a mistake they issue a public retraction the next day to let everyone know. I believe Ben should make these retractions more visible
- ^
To add sources to recent examples that come to mind that broadly support MHR’s point above RE: visible (ex post) failures that don’t seem to be harshly punished (most responses seem somewhere between neutral and supportive, at least publicly):
Lightcone
Alvea
ALERT
AI Safety Support
EA hub
No Lean Season
Some failures that came with a larger proportion of critical feedback probably include the Carrick Flynn campaign (1, 2, 3), but even here “harshly punish” seems like an overstatement. HLI also comes to mind (and despite highly critical commentary in earlier posts, I think the highly positive response to this specific post is telling).
============
On the extent to which Nonlinear’s failures relate to integrity / engineering, I think I’m sympathetic to both Rob’s view:
I think the failures that seem like the biggest deal to me (Nonlinear threatening people and trying to shut down criticism and frighten people) genuinely are matters of character and lack of integrity, not matters of bad engineering.
As well as Holly’s:
If you wouldn’t have looked at it before it imploded and thought the engineering was bad, I think that’s the biggest thing that needs to change. I’m concerned that people still think that if you have good enough character (or are smart enough, etc), you don’t need good boundaries and systems.
but do not think these are necessarily mutually exclusive.
Specifically, it sounds like Rob is mainly thinking about the source of the concerns, and Holly is thinking about what to do going forwards. And it might be the case that the most helpful actionable steps going forward are things that look more like improving boundaries and systems, regardless of whether you believe failures specific to Nonlinear are caused by deficiencies in integrity or engineering.
That said, I agree with Rob’s point that the most significant allegations raised about Nonlinear quite clearly do not fit the category of ‘appropriate experimentation that the community would approve of’, under almost all reasonable perspectives.
I was a participant and largely endorse this comment.
one contributor to a lack of convergence was attrition of effort and incentives. By the time there was superforecaster-expert exchange, we’d been at it for months, and there weren’t requirements for forum activity (unlike the first team stage)
[Edit: wrote this before I saw lilly’s comment, would recommend that as a similar message but ~3x shorter].
============
I would consider Greg’s comment as “brought up with force”, but would not consider it an “edge case criticism”. I also don’t think James / Alex’s comments are brought up particularly forcefully.
I do think it is worth making the case that pushing back on comments that are easily misinterpreted or misleading is also not an edge case criticism though, especially if these are comments that directly benefit your organisation.
Given the stated goal of the EA community is “to find the best ways to help others, and put them into practice”, it seems especially important that strong claims are sufficiently well-supported, and made carefully + cautiously. This is in part because the EA community should reward research outputs if they are helpful for finding the best ways to do good, not solely because they are strongly worded; in part because EA donors who don’t have capacity to engage at the object level may be happy to defer to EA organisations/recommendations; and in part because the counterfactual impact diverted from the EA donor is likely higher than the average donor.
For example:
“We’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money”.[1]
Michael has expressed regret about this statement, so I won’t go further into this than I already have. However, there is a framing in that comment that suggests this is an exception, because “HLI is quite well-caveated elsewhere”, and I want to push back on this a little.
HLI has previously been mistaken for an advocacy organisation (1, 2). This isn’t HLI’s stated intention (which is closer to a “Happiness/Wellbeing GiveWell”). I outline why I think this is a reasonable misunderstanding here (including important disclaimers that outline HLI’s positives).
Despite claims that HLI does not advocate for any particular philosophical view, I think this is easily (and reasonably) misinterpreted.
James’ comment thread below: “Our focus on subjective wellbeing (SWB) was initially treated with a (understandable!) dose of scepticism. Since then, all the major actors in effective altruism’s global health and wellbeing space seem to have come around to it”
See alex’s comment below, where TLYCS is quoted to say: “we will continue to rely heavily on the research done by other terrific organizations in this space, such as GiveWell, Founders Pledge, Giving Green, Happier Lives Institute [...]”
I think excluding “to identify candidates for our recommendations, even as we also assess them using our own evaluation framework” [emphasis added] gives a fairly different impression to the actual quote, in terms of whether or not TLYCS supports WELLBYs as an approach.
While I wouldn’t want to exclude careless communication / miscommunication, I can understand why others might feel less optimistic about this, especially if they have engaged more deeply at the object level and found additional reasons to be skeptical.[2] I do feel like I subjectively have a lower bar for investigating strong claims by HLI than I did 7 or 8 months ago.
(commenting in personal capacity etc)
============
Adding a note RE: Nathan’s comment below about bad blood:
Just for the record, I don’t consider there to be any bad blood between me and any members of HLI. I previously flagged a comment I wrote with two HLI staff, worrying that it might be misinterpreted as uncharitable or unfair. Based on positive responses there and from other private discussions, my impression is that this is mutual.[3]
- ^
-This was the claim that originally prompted me to look more deeply into the StrongMinds studies. After <30 minutes on StrongMinds’ website, I stumbled across a few things that stood out as surprising, which prompted me to look deeper. I summarise some thoughts here (which has been edited to include a compilation of most of the critical relevant EA forum commentary I have come across on StrongMinds), and include more detail here.
-I remained fairly cautious about claims I made, because this entire process took three years / 10,000 hours, so I assumed by default I was missing information or that there was a reasonable explanation.
-However, after some discussions on the forum / in private DMs with HLI staff, I found it difficult to update meaningfully towards believing this statement was a sufficiently well-justified one. I think a fairly charitable interpretation would be something like “this claim was too strong, it is attributable to careless communication, but unintentional.”
- ^
Quotes above do not imply any particular views of commentors referenced.
- ^
I have not done this for this message, as I view it as largely a compilation of existing messages that may help provide more context.
A commonly used model in the trust literature (Mayer et al., 1995) is that trustworthiness can be broken down into three factors: ability, benevolence, and integrity.
RE: domain specific, the paper incorporates this under ‘ability’:
The domain of the ability is specific because the trustee may be highly competent in some technical area, affording that person trust on tasks related to that area. However, the trustee may have little aptitude, training, or experience in another area, for instance, in interpersonal communication. Although such an individual may be trusted to do analytic tasks related to his or her technical area, the individual may not be trusted to initiate contact with an important customer. Thus, trust is domain specific.
There are other conceptions but many of them describe something closer to trust that is domain specific rather than generalised.
...All of these are similar to ability in the current conceptualization. Whereas such terms as expertise and competence connote a set of skills applicable to a single, fixed domain (e.g., Gabarro’s interpersonal competence), ability highlights the task- and situation-specific nature of the construct in the current model.
This is a conversation I have a fair amount when I talk to non-EA + non-medical friends about work, some quick thoughts:
If someone asks me Qs around DALYs at all (i.e. “why measure”), I would point to general cases where this happens fairly uncontroversially, e.g.:
-If you were in charge of the health system, how would you choose to distribute the resources you get?
-If you were building a hospital, how would you go about choosing how to allocate your wards to different specialties?
-If you were in an emergency waiting room and you had 10 people in the waiting room, how would you choose who to see first?
These kinds of questions entail some kind of “diverting resources from one person to another” in a way that is pretty understandable (though they also point to reasonable considerations for why you might not only use DALYs in those contexts)
If someone is challenging me over using DALYs in context of it being a measurement system that is potentially ableist, then I generally just agree—it is indeed ableist by some framings![1]
Though, often in these conversations the underlying theme isn’t necessarily a “I have a problem with healthcare prioritisation” but a general sense that disabled folk aren’t receiving enough resources for their needs—so when having these conversations it’s important to acknowledge that disabled folk do just face a lot more challenges navigating the healthcare system (and society generally) through no fault of their own, and that we haven’t worked out the answers to prioritising accordingly or for solving the barriers that disabled folk face.
If the claim goes further and explicitly says that interventions for disabilities are more cost-effective than the current DALY approach gives them credit for, then that’s also worth considering, though the standard would correspondingly increase if they are suggesting a new approach to resource allocation. As Larks’ comment illustrates, it is difficult to find a single approach / measure that doesn’t push against intuitions or have something problematic at the policy level.[2]
On how you’re feeling when talking about prioritising:
But then I feel like I’m implicitly saying something about valuing some people’s lives less than others, or saying that I would ultimately choose to divert resources from one person’s suffering to another’s.
This makes sense, though I do think there is a decent difference between the claim of “some people’s lives are worth more than others” and the claim of “some healthcare resources go further in one context than others (and thus justify the diversion)”. For example, I think if you never actively deprioritised anyone you would end up implicitly/passively prioritising based on things like [who can afford to go to the hospital / who lives closer / other access constraints]. But these are going to be much less correlated with what people care about when they say “all lives are equal”.
But if we have data on what the status quo is, then “not prioritising” / “letting the status quo happen” is still a choice we are making! And so we try to improve on the status quo and save more lives, precisely because we don’t think the 1000 patients on diabetes medication are worth less than the one cancer patient on a third-line immunotherapy.
- ^
E.g., for DALYs, the disability weight of 1 person with (condition A+B) is mathematically forced to be lower than the combined disability weight of two separate individuals with condition A and condition B respectively. That means for any cure of condition A, those who have only condition A would theoretically be prioritised under the DALY framework over those who also have other health issues (e.g. have a disability). While I don’t have a good sense of when/if this specific part of the DALY framework has impacted resource allocation in practice, it is important to acknowledge the (many!) limitations of the measures we use.
- ^
Also, different folks within the disability community also have a wide range of views around what it means to live with a disability / be a disabled person (e.g. functional VS social models of disability), so it’s not actually clear that e.g., WELLBYs would necessarily lead to more healthcare resources in that direction, depending on which groups you were talking to.
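To make the arithmetic in the first footnote concrete, here is a minimal sketch. It assumes the multiplicative comorbidity adjustment that burden-of-disease estimates commonly use (combined weight = 1 minus the product of (1 - each weight)); that formula is my assumption about how the “mathematically forced” part arises, not something stated above. The function name and the two weights are hypothetical and chosen only for illustration.

```python
from math import prod


def combined_disability_weight(weights):
    """Combine weights multiplicatively: 1 - prod(1 - w_i).

    Assumption: this is the comorbidity adjustment in use; the result
    can never exceed 1, so it is at most the simple sum of the weights.
    """
    for w in weights:
        if not 0.0 <= w <= 1.0:
            raise ValueError("disability weights must lie in [0, 1]")
    return 1.0 - prod(1.0 - w for w in weights)


# Hypothetical disability weights for condition A and condition B.
dw_a, dw_b = 0.3, 0.4

# One person living with both conditions.
one_person_with_both = combined_disability_weight([dw_a, dw_b])

# Two separate people, one with A and one with B.
two_separate_people = dw_a + dw_b

print(f"one person with A+B: {one_person_with_both:.2f}")
print(f"two separate people: {two_separate_people:.2f}")
```

With these made-up numbers, the person with both conditions is assigned a smaller total weight than the two separate individuals combined, so curing condition A averts less measured burden for them than for someone who has only condition A.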
- ^
Thanks for writing this! RE: We would advise against working at Conjecture
We think there are many more impactful places to work, including non-profits such as Redwood, CAIS and FAR; alignment teams at Anthropic, OpenAI and DeepMind; or working with academics such as Stuart Russell, Sam Bowman, Jacob Steinhardt or David Krueger. Note we would not in general recommend working at capabilities-oriented teams at Anthropic, OpenAI, DeepMind or other AGI-focused companies.
Additionally, Conjecture seems relatively weak for skill building [...] We expect most ML engineering or research roles at prominent AI labs to offer better mentorship than Conjecture. Although we would hesitate to recommend taking a position at a capabilities-focused lab purely for skill building, we find it plausible that Conjecture could end up being net-negative, and so do not view Conjecture as a safer option in this regard than most competing firms.
I don’t work in AI safety and am not well-informed on the orgs here, but did want to comment on this as this recommendation might benefit from some clarity about who the target audience is.
As written, the claims sound something like:
CAIS et al., alignment teams at Anthropic et al., and working with Stuart Russell et al., are better places to work than Conjecture
Though not necessarily recommended, capabilities research at prominent AI labs is likely to be better than working at Conjecture for skill building, since Conjecture is not necessarily safer.
However:
The suggested alternatives don’t seem like they would be able to absorb a significant amount of additional talent, especially given the increase in interest in AI.
I have spoken to a few people working in AI / AI field building who perceive mentoring to be a bottleneck in AI safety at the moment.
If both of the above are true, what would your recommendation be to someone who had an offer from Conjecture, but not your recommended alternatives? E.g., choosing between independent research funded by LTFF VS working for Conjecture?
Just seeking a bit more clarity about whether this recommendation is mainly targeted at people who might have a choice between Conjecture and your alternatives, or whether this is a blanket recommendation that one should reject offers from Conjecture, regardless of seniority and what their alternatives are, or somewhere in between.
Thanks again!
Some very quick thoughts on EY’s TIME piece from the perspective of someone ~outside of AI safety work. I have no technical background and don’t follow the field closely, so I’m likely to be missing some context and nuance; happy to hear pushback!
Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
My immediate reaction when reading this was something like “wow, is this representative of AI safety folks? Are they willing to go to any lengths to stop AI development?”. I’ve heard anecdotes of people outside of all this stuff saying this piece reads like it came from a terrorist organisation, for example, which I think is a stronger term than I’d use, but I think suggestions like this do unfortunately play into potential comparisons to ecofascists.
Given that EY is seen publicly as a thought leader and is widely regarded as a founder of the field, there are some risks to this kind of messaging. It’s hard to evaluate how this trades off, but I definitely know communities and groups that would be pretty put off by this, and it’s unclear how much value the sentences around willingness to escalate nuclear war are actually adding.
It’s an empirical Q about how to trade off between risks from nuclear war and risks from AI, but the claim of “preventing AI extinction is a priority above a nuclear exchange” is ~trivially true; the reverse is also true: “preventing extinction from nuclear war is a priority above preventing AI training runs”. Given the difficulty of illustrating and defending to the general public a position that the risks of AI training runs are substantially higher than those of a nuclear exchange, I would have erred on the side of caution when saying things that are as politically charged as advocating for nuclear escalation (or at least something that can be interpreted as such).
I wonder which superpower EY trusts to properly identify a hypothetical “rogue datacentre” that’s worthy of a military strike for the good of humanity, or whether this will just end up with parallels to other failed excursions abroad ‘for the greater good’ or to advance individual national interests.
If nuclear weapons are a reasonable comparison, we might expect restrictions to leave a few competing global powers with access to AI developments, and other countries without. It seems plausible that criticism around these treaties being used to maintain the status quo in the nuclear nonproliferation / disarmament debate may be applicable here too.
Unlike nuclear weapons (though nuclear power may weaken this somewhat), developments in AI have the potential to help immensely with development and economic growth.
Thus the conversation may eventually bump into something that looks like:
Richer countries / first movers that have obtained significant benefits of AI take steps to prevent other countries from catching up.[1]
Rich countries using the excuse of preventing AI extinction as a guise to further national interests
Development opportunities from AI for LMICs are similarly hindered, or only allowed in a way that is approved by the first movers in AI.
Given the above, and that conversations around and tangential to AI risk already receive some pushback from the Global South community for distracting and taking resources away from existing commitments to UN Development Goals, my sense is that folks working in AI governance / policy would likely strongly benefit from scoping out how these developments are affecting Global South stakeholders, and how to get their buy-in for such measures.
(disclaimer: one thing this gestures at is something like “global health / development efforts can be instrumentally useful towards achieving longtermist goals”[2], which is something I’m clearly interested in as someone working in global health. While it seems rather unlikely that doing so is the best way of achieving longtermist goals on the margin[3], that doesn’t rule out some aspect of this being a necessary condition for important wins like an international treaty, if that’s what is currently being advocated for. It is also worth mentioning because I think this is likely to be a gap / weakness in existing EA approaches.)
In our new report, The Elephant in the Bednet, we show that the relative value of life-extending and life-improving interventions depends very heavily on the philosophical assumptions you make. This issue is usually glossed over and there is no simple answer.
We conclude that the Against Malaria Foundation is less cost-effective than StrongMinds under almost all assumptions. We expect this conclusion will similarly apply to the other life-extending charities recommended by GiveWell.
In suggesting James quote these together, it sounds like you’re saying something like “this is a clear caveat to the strength of recommendation behind StrongMinds, HLI doesn’t recommend StrongMinds as strongly as the individual bullet implies, it’s misleading for you to not include this”.
But in other places HLI’s communication around this takes on a framing of something closer to “The cost effectiveness of AMF, (but not StrongMinds) varies greatly under these assumptions. But the vast majority of this large range falls below the cost effectiveness of StrongMinds”. (extracted quotes in footnote)[1]
As a result of this framing, despite the caveat that HLI “[does] not advocate for any particular view”, I think it’s reasonable to interpret this as being strongly supportive of StrongMinds, which can be true even if HLI does not have a formed view on the exact philosophical view to take.[2]
If you did mean the former (that the bullet about philosophical assumptions is primarily included as a caveat to the strength of recommendation behind StrongMinds), then there is probably some tension here between (emphasis added):
-“the relative value of life-extending and life-improving interventions depends very heavily on the philosophical assumptions you make...there is no simple answer”, and
-“We conclude StrongMinds > AMF under almost all assumptions”
Additionally, I think one piece of weak evidence that HLI’s communication is not as well-caveated as it could be is that many people (mistakenly) viewed HLI as an advocacy organisation for mental health interventions. I do think this is a reasonable outside interpretation based on HLI’s communications, even though this is not HLI’s stated intent. For example, I don’t think it would be unreasonable for an outsider to read your current pinned thread and come away with conclusions like:
“StrongMinds is the best place to donate”,
“StrongMinds is better than AMF”,
“Mental health is a very good place to donate if you want to do the most good”,
“Happiness is what ultimately matters for wellbeing and what should be measured”.
If these are not what you want people to take away, then I think pointing to this bullet point caveat doesn’t really meaningfully address this concern; the response kind of feels something like “you should have read the fine print”. While I don’t think it’s necessary for HLI to take a stance on specific philosophical views, I do think it becomes an issue if people are (mis)interpreting HLI’s stance based on its published statements.
(commenting in personal capacity etc)
- ^
-We show how much cost-effectiveness changes by shifting from one extreme of (reasonable) opinion to the other. At one end, AMF is 1.3x better than StrongMinds. At the other, StrongMinds is 12x better than AMF.
-StrongMinds and GiveDirectly are represented with flat, dashed lines because their cost-effectiveness does not change under the different assumptions.
-As you can see, AMF’s cost-effectiveness changes a lot. It is only more cost-effective than StrongMinds if you adopt deprivationism and place the neutral point below 1.
- ^
As you’ve acknowledged, comments like “We’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money.” perhaps add to the confusion.
Would you be happy to expand on these points?