[comment I’m likely to regret writing; still seems right]
It seems a lot of people are reacting by voting, but the karma of the post is 0. It seems to me up-votes and down-votes are really not expressive enough, so I want to add a more complex reaction.
It is really very unfortunate that the post is framed around the question whether Will MacAskill is or is not honest. This is wrong, and makes any subsequent discussion difficult. (strong down-vote) (Also the conclusion (“he is not”) is not really supported by the evidence.)
It is (and was even more in the blog version) over-zealous, interpreting things uncharitably, and suggesting extreme actions. (downvote)
At the same time, it seems really important to have an open and critical discussion, and culture where people can challenge ‘canonical’ EA books and movement leaders. (upvote)
Carefully going through the sources and checking if papers are not cherry-picked and represented truthfully is commendable. (upvote)
Having really good epistemics is really important, in particular with the focus on long-term. Vigilance in this direction seems good. (upvote)
So it seems really a pity the post was not framed as a question somewhere in the direction of “do you think this is epistemically good?”
If I try to imagine something like a “steel-manned version of the post”, without questioning honesty and without making uncharitable inferences, the reaction could have been some useful discussion.
It seems to me
“Doing Good Better” is sometimes more on the “explaining & advocacy of ideas” side than “dispassionate representation of research”.
Given the genre, I would bet the book is in the top quartile on the metric of representing research correctly.
In some of the examples, it seems adding more caveats and reporting in more detail would have been better for readers interested in precision. Likely at the cost of making the text more dry.
Some emotions sometimes creep in: In the case of the somewhat uncharitable part about Charity Navigator, I remembered their much more uncharitable / misrepresenting text attacking effective altruism and GiveWell. Also, while they talk about the importance of other things, what they actually measure is wrong, and is criticized correctly. In the case of the whole topic… well, a lot of evidence points toward things like a 100x multiplier being true, meaning that yes, it actually is possible to save many more people. It seems hard not to have some passion.
Given that several books about the long-term future are now written, the update I would take from that is that books mostly about the long term should err more on the side of caveating, describing disagreements, and explaining uncertainty, but my feeling is that a shift in this direction already happened between 2014 and 2018.
I agree with all the points you make here, including on the suggested upvote/downvote distribution, and on the nature of DGB. FWIW, my (current, defeasible) plan for any future trade books I write is that they’d be more highbrow (and more caveated, and therefore drier) than DGB.
I think that’s the right approach for me, at the moment. But presumably at some point the best thing to do (for some people) will be wider advocacy (wider than DGB), which will inevitably involve simplification of ideas. So we’ll have to figure out what epistemic standards are appropriate in that context (given that GiveWell-level detail is off the table).
Some preliminary thoughts on heuristics for this (these are suggestions only):
Standards we’d want to keep as high as ever:
Is the broad brush strokes picture of what is being conveyed accurate? Is there any easy way the broad brush of what is conveyed could have been made more accurate?
Are the sentences being used to support this broad brush strokes picture warranted by the evidence?
Is this way of communicating the core message about as caveated and detailed as one can reasonably manage?
Standards we’d need to relax:
Does this communicate as much detail as possible with respect to the relevant claims?
Does this communicate all the strongest possible counterarguments to the key claim?
Thanks. I think the criteria you propose for which standards to keep and which to relax are reasonable.
It seems an important question. I would like someone to try studying it more formally, using for example “value of information” or “rational inattention” frameworks. I can imagine experiments like giving people a longer list of arguments, gathering feedback on how valuable each was for them, and then making decisions based on that. (Now this seems to be done mainly based on authors’ intuitions.)
In some of the examples, it seems adding more caveats and reporting in more detail would have been better for readers interested in precision.
I should point out that in the post I show not just a lack of caveats and details. William misrepresents the evidence. Among other things, he:
cherry picks the variables from a deworming paper he cites
interprets GW’s AMF estimate in a way they specifically asked people not to interpret it (“five hundred times” more effective thing; Holden wrote specifically about such arguments that they seem to require taking cost-effectiveness estimates literally)
quotes two sentences from Charity Navigator’s site when the very next sentence shows that the interpretation of the previous sentences is wrong
In a long response William posted here, he did not address any of these points:
he doesn’t mention cherry picking (and neither does his errata page)
he doesn’t mention the fact that GiveWell asked not to interpret their AMF estimate literally
and he writes “I represent CN fairly, and make a fair criticism of its approach to assessing charities.”, which may be true about some general CN’s position, but which has nothing to do with misquoting Charity Navigator.
If the issue was just a lack of detail, of course I would not have written the post in such a tone. Initially, I considered simply emailing him a list of mistakes that I found, but as I mentioned in the post, the volume and egregiousness of misrepresentations led me to conclude that he argued in bad faith.
edit: I will email GiveWell to clarify what they think about William making claims about 500 times more benefit on the basis of their AMF estimate.
I think I understand how you gradually became upset, but it seems that in the process you started to miss the more favorable interpretations.
For example, with the “interpretation of the GiveWell estimates”: based on reading a bunch of old discussions on archive, my _impression_ is that there was, at least at some point in time, a genuine disagreement about how to interpret the numbers between Will, Tobi, Holden and possibly others (there was much less disagreement about the numeric values). So if this is the case, it is plausible Will was using his interpretation of the numbers, which was in some sense “bolder” than the GW interpretation. My sense of a good epistemic standard is that you certainly can do this, but should add a caveat warning that the authors of the numbers interpret them differently (so it is a missing caveat). At the same time, I can imagine how you can fail to do this without any bad faith: for example, if you are at some point of the discussion confused about whether some object-level disagreement continues or not (especially if you ask the other party in the disagreement to check the text). Also, if my impression is correct and the core of the object-level disagreement was a quite technical question regarding the proper use of Bayesian statistics and EV calculations, it does not seem obvious how to report the disagreement to the general public.
In general: switching to the assumption that someone is deliberately misleading is a highly slippery slope: it seems that with this sort of assumption you can kind of explain everything, often easily, and if you can’t e.g. speak to people in person it may be quite difficult to find anything which would make you update in the opposite direction.
About cost-effectiveness estimates: I don’t think your interpretation is plausible. The GiveWell page that gives the $3400 estimate specifically asks not to interpret it literally.
About me deciding that MacAskill is deliberately misleading. Please see my comment in /r/slatestarcodex in response to /u/scottalexander about it. Would love to know what you think.
[because of time constraints, I will focus on just one example now]
Yes, but GiveWell is not some sort of ultimate authority on how their numbers should be interpreted. Take an ad absurdum example: the NRA publishes some numbers about guns and gun-related violence, along with their interpretation that there are not enough guns in the US and gun violence is low. If you basically agree with the numbers, but disagree with their interpretation, surely you can use the numbers and interpret them in a different way.
GiveWell reasoning is explained in this article. Technically speaking you _can_ use the numbers directly as EV estimates if you have a very broad prior, and the prior is all the same across all the actions you are comparing. (You can argue this is technically not the right thing to do, or you can argue that GiveWell advises people not to do it.) As I stated in my original comment, I’d appreciate if such disagreements are reported. At the same time it seems difficult to do it properly in a popular text. I can imagine something like this
According to the most rigorous estimates by GiveWell, the cost to save a life in the developing world is about $3,400 (or $100 for one QALY [Quality-adjusted life year]). However, this depends on a literal interpretation of the numbers, which GiveWell does not recommend. But if you start with a very broad prior distribution over action impacts, uniform across actions, even if you use the correct Bayesian statistics, the mean expectation value of the cost will be the number we use (we can see that from the estimate being unbiased). About $3,400 is a small enough amount that most of us in affluent countries could donate that amount every year while maintaining about the same quality of life. …
being more precise, but you can probably see it is a very different book now. I’d be quite interested in how you would write the paragraph if you wanted to use the number, wanted to give a numerical estimate of the cost per life saved, and did not want to explain Bayesian estimates to the reader.
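To illustrate the claim that a very broad prior lets you read an unbiased estimate directly as an expected value, here is a minimal sketch. It assumes a normal prior and normally distributed estimate error, which is a modelling assumption made only for this illustration and not something stated in the comment, and every number in it is hypothetical (arbitrary units of impact, not GiveWell figures).

```python
def posterior_mean(prior_mean, prior_sd, estimate, estimate_sd):
    """Posterior mean when a normal prior is updated on a normal, unbiased estimate.

    This is the standard precision-weighted average of prior mean and estimate.
    """
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / estimate_sd ** 2
    total_precision = prior_precision + estimate_precision
    return (prior_precision * prior_mean + estimate_precision * estimate) / total_precision


# Hypothetical values, chosen only to show the effect of the prior's width.
estimate = 10.0      # the published point estimate
estimate_sd = 5.0    # how noisy we think that estimate is
prior_mean = 1.0     # what we expected before seeing the estimate

# A narrow prior pulls the answer strongly back toward the prior mean.
narrow = posterior_mean(prior_mean, prior_sd=1.0, estimate=estimate, estimate_sd=estimate_sd)

# A very broad prior leaves the answer almost equal to the raw estimate.
broad = posterior_mean(prior_mean, prior_sd=1000.0, estimate=estimate, estimate_sd=estimate_sd)

print(f"narrow prior: {narrow:.3f}")
print(f"broad prior:  {broad:.3f}")
```

With the narrow prior the result stays close to the prior mean, and with the broad prior it is nearly the raw estimate. So under these assumptions the disagreement reduces to how wide a prior one considers reasonable.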
Guzey, would you consider rewriting this post, framing it not as questioning MacAskill’s honesty but rather just pointing out some flaws in the representation of research? I fully buy some of your criticisms (it was an epistemic failure to not report that deworming has no effect on test scores, misrepresent Charity Navigator’s views, and misrepresent the “ethical employer” poll). And I think Jan’s views accurately reflect the community’s views: we want to be able to have open discussion and criticism, even of the EA “canon.” But it’s absolutely correct that the personal attacks on MacAskill’s integrity make it near impossible to have this open discussion.
Even if you’re still convinced that MacAskill is dishonest, wouldn’t the best way to prove it to the community be to have a thorough, open debate over these factual questions? Then, if it becomes clear that your criticisms are correct, people will be able to judge the honesty issue themselves. I think you’re limiting your own potential here by making people not want to engage with your ideas.
I’d be happy to engage with the individual criticisms here and have some back and forth, if only this was written in a less ad hominem way.
Separately, does anyone have thoughts on the John Bunker DALY estimate? MacAskill claims that a developed world doctor only creates 7 DALYs, Bunker’s paper doesn’t seem to say anything like this, and this 80,000 Hours blog estimates instead that a developed world doctor creates 600 QALYs. Was MacAskill wrong on the effectiveness of becoming a doctor?
I do wonder if I should’ve written this post in a less personal tone. I will consider writing a follow up to it.
About me deciding that MacAskill is deliberately misleading, please see my comment in /r/slatestarcodex in response to /u/scottalexander about it. Would love to know what you think.
I’ll headline this by saying that I completely believe you’re doing this in good faith, I agree with several of your criticisms, and I think this deserves to be openly discussed. But I also strongly disagree with your conclusion about MacAskill’s honesty, and, even if I thought it was plausible, it still would be an unnecessary breach of etiquette that makes open conversation near impossible. I really think you should stop making this an argument about MacAskill’s personal honesty. Have the facts debate, leave ad hominem aside so everyone can fully engage, and if you’re proven right on the facts, then raise your honesty concerns.
First I’d like to address your individual points, then your claims about MacAskill.
Misreporting the deworming study. I think this is your best point. It seems entirely correct that if textbooks fail because they don’t improve test scores, that deworming should fail by the same metric. But I agree with /u/ScottAlexander that, in popular writing, you often don’t have the space to specifically go through all the literature on why deworming is better. MacAskill’s deworming claims were misleading on one level, in that the specific argument he provided is not a good one, but also fair on another level: MacAskill/GiveWell has looked tons into deworming, concluded that it’s better than textbooks, and this is the easiest way to illustrate why in a single sentence. Nobody reading this is looking for a survey of the evidence base on deworming; they’re reading it as an introduction to thinking critically about interventions. Bottom line: MacAskill probably should’ve found a better example/line of defense that was literally true, but even this literally false claim serves its purpose in making a broader, true point.
Interpreting GiveWell literally. Jan’s comment was perfect: GiveWell is not the supreme authority on how to interpret their numbers. Holden prefers to give extra weight to expected values with low uncertainty, MacAskill doesn’t, and that’s a legitimate disagreement. In any case, if you think people shouldn’t ever interpret GiveWell’s estimates literally when pitching EA, that’s not a problem with MacAskill, it’s a problem with >90% of the EA community. Bottom line: I think you should drop this argument, I just don’t think it’s correct.
Misrepresenting Charity Navigator. As MacAskill admits, it’s inaccurate to conflate overhead costs and CEO pay. Good find, the specific criticism was correct. But after thinking it through, I think MacAskill’s argument, while botching that single detail, is still a fair criticism of an accurate overall characterization of Charity Navigator. Let’s focus on the donut example. MacAskill says that if a donut charity had a low-paid CEO, CN would rate them highly. You correctly identify that CN cares about things other than CEO pay, and is willing to give good ratings to charities with highly paid CEOs if they do well on other metrics, namely financial stability, accountability, and transparency. BUT, MacAskill’s point I believe would be that none of those other CN metrics have to do with the effectiveness of the intervention or the cause area. CN will let financial stability and low employee costs outweigh a highly-paid CEO, but they won’t let a terrible cause bring down your rating. So if you had a highly efficient, financially well-managed donut charity, CN really would give them a good rating. Bottom line: MacAskill mistakenly conflates CEO pay with overhead costs. But that’s incredibly minor, and no reader is going to be annoyed by it. His fundamental point is correct: CN doesn’t care about cause area or intervention effectiveness, and that’s silly to the point of absurdity.
Further, even if you still think MacAskill unfairly represented CN’s position, I’m willing to cut him a bit of slack on it. Do check out their hit piece on effective altruism. It’s aggressive, demeaning, and rude. Yes, it would’ve been better if MacAskill took the perfect high road, but if the inaccuracy really is minor, I think we can excuse it.
Exaggerating PlayPump’s failures. At first, I bought what you said in your comment. Everyone can read what you have to say themselves, but basically, it seems like MacAskill may have exaggerated the reports he cites discussing the failures of the PlayPump. But after a quick Google, it seems like this is another example of a specific line of argumentation that really isn’t rigorous, but that tries to make a fair point in a single sentence. PlayPump was a disaster, everyone agrees, and MacAskill was absolutely not the first to say so. So although MacAskill could’ve better explained specifically why it was a failure, without exaggerating reports, his conclusion is completely fair. I absolutely agree with the importance of honesty, and that bad arguments for a good conclusion are not justified. But this is popular writing, and he really doesn’t have space to fully review all the ins and outs of PlayPumps. Bottom line: I wish MacAskill more accurately justified his view, but nobody who looks into this should feel misled about the overall point of the failure of PlayPumps.
Conclusion: I think you correctly identify several inaccuracies in DGB. But after looking into them myself, I think you really overestimated the importance of these inaccuracies. Except perhaps the deworming example, none of these inaccuracies, if corrected, would change anything important about the conclusions of the book.
Even if you think I’m underestimating the level of inaccuracy, it seems near impossible that this is a sign of malice. If you go into a Barnes and Noble and pick out the popular nonfiction sitting to the left and right of DGB, I think you’d find dozens of inaccuracies far more important than these. Popular writing needs to oversimplify complex debates. DGB does an admirable job of preserving truth while simplifying.
I’ll reiterate that I really do believe in your good faith. You found inaccuracies, and you began worrying about MacAskill’s honesty, which drove you to find more inaccuracies. I think if you step back and consider the charitable interpretation of these flaws, though, you’ll realize that there are good reasons why they’re minor, and that it’s highly unlikely that this is the result of malice.
But finally, regardless of your conclusions on MacAskill’s honesty, I’ll say again that it’s absolutely destructive to open discourse and everyone’s goals to headline your post calling MacAskill a liar. If you want the community to engage this conversation, you have to stick to the substantive disagreements. If consensus concludes that MacAskill importantly and repeatedly fails, people will question his honesty on their own. But I think if the open debate is had, you’ll eventually come around to thinking that these inaccuracies are minor, inconsequential, and accidental.
2. GiveWell. This seems like a good argument. I will think about it.
3. CN. If you read my post and not William’s response to it, I never accuse him of conflating CEO pay and overhead. He deflects my argument by writing about this. This is indeed a minor point.
I specifically accuse him of misquoting CN. As I wrote in other comments here, yes this might indeed be CN’s position and in the end, they would judge the doughnut charity highly. I do not contend this point and never did. I only wrote that MacAskill (1) quotes CN, (2) makes conclusions based on this quote about CN, (3) the very page that MacAskill takes the quote from says that their position does not lead to these conclusions. And maybe CN is being completely hypocritical! This is not the point. It is still dishonest to misquote them.
4. PlayPumps: I feel like you’re kind of missing the point and I’m wondering if it might be some sort of a fundamental disagreement about unstated assumptions? I think that making dishonest arguments that lead to the right conclusions is still dishonest. It seems that you (and many other EAs) feel that if the conclusion is correct, then the fact that the argument was dishonest is not so important (same as with CN). Here’s what you say:
But this is popular writing, and he really doesn’t have space to fully review all the ins and outs of PlayPumps.
And here’s what I wrote in that comment specifically about this argument:
All of what you say seems reasonable. If Doing Good Better was just a popular book, I would not care about all of this stuff. But this book serves as an introduction to Effective Altruism and the whole premise of the book is that it’s objective and uses evidence to arrive at conclusions, etc, and advocates an evidence-based approach to philanthropy. And, although I don’t consider myself EA, a lot of my friends do, and I care about the movement. …
So we cannot judge the book as we would any other popular book where the author has a narrative and peppers it with random studies they found. I’m not so bothered by the misrepresentations per se but by the hypocrisy. …
just in the Introduction, William first trashes PlayPumps (not saying a single good word about them and very liberally exaggerating his sources) and then praises deworming almost as a salvation. And again, this is entirely natural for a popular book—but not for a book that introduces Effective Altruism and evidence-based approach to philanthropy. …
3. MacAskill:
According to the UNICEF report, children sometimes fell off and broke limbs, and some vomited from the spinning. [emphasis mine]
UNICEF report:
Some users reported that children had fallen off and been injured with bruises and cuts, and in one case a child fractured their arm.[emphasis mine]
This is a very good example of a point I’m making—of course a popular book will exaggerate things like that. But again—not a book that advocates an even-handed, evidence-based approach to philanthropy.
And in your conclusion you write:
Except perhaps the deworming example, none of these inaccuracies, if corrected, would change anything important about the conclusions of the book.
Yes! I mostly agree with this! But (1) these are not just inaccuracies. I point out misrepresentations. (2) I believe that making dishonest arguments that advance the right conclusions is dishonest.
Do I understand you correctly that you disagree with me on point (2)?
First, on honesty. As I said above, I completely agree with you on honesty: “bad arguments for a good conclusion are not justified.” This is one of my (and I’d say the EA community as a whole) strongest values. Arguments are not soldiers, their only value is in their own truth. SSC’s In Favor of Niceness, Community, and Civilization sums up my views very well. I’m glad we’re after the same goal.
That said, in popular writing, it’s impossible to reflect the true complexity of what’s being described. So the goal is to simplify as much as possible, while losing as little truth as possible. If someone simplifies in a way that’s importantly misleading, that’s an important failure and should be condemned. But the more I dig into each of these arguments, the more I’m convinced MacAskill is doing a very good job maintaining truth while simplifying.
Charity Navigator. MacAskill says “One popular way of evaluating a charity is to look at financial information regarding how the charity spends its money.” He says that CN takes this approach, and then quotes CN saying that many of the best charities spend 25% or less on overhead. You say this is a misquote, because CN later says that high overhead can be OK if balanced by other indicators of financial health. CN says they like to see charities “that are able to grow their revenue at least at the rate of inflation, that continue to invest in their programs and that have some money saved for a rainy day.”
I see absolutely no misrepresentation here. MacAskill says CN evaluates based on financials such as overhead pay, and quotes CN saying that. He never says that CN only looks at overhead pay, neglecting other financials. In fact, his quote of CN says that overhead indicator is a “strong indicator” in “most” charities, which nobody would interpret as claiming that CN literally only evaluates overhead. The fact that CN does in fact care about financials other than overhead is abundantly clear when reading MacAskill’s summary. MacAskill perfectly represents their view. I doubt someone from CN would ever take issue with that first paragraph.
PlayPumps. Charge by charge: 1. After checking out both the UN and SKAT reports, I agree with MacAskill: they’re “damning”. 2. MacAskill says “But in order to pump water, PlayPumps need constant force, and children playing on them would quickly get exhausted.” You quote UNICEF saying “Some primary school children complained of becoming tired very quickly after pushing the pump, particularly as additional torque is required with each rotation to commence the upstroke of the piston.” Look at a video of one in motion: it’s clear that it spins easily for a little while but also constantly requires new force. No misrepresentation. 3. “Children sometimes fell off and broke limbs” is an exaggeration. One child fractured their arm, not multiple. MacAskill misrepresented the number of injuries. 4. The reporter said that PlayPump requires 27 hours of pumping a day in order to meet its ambition of supplying 15 liters a day to 10 million people using 4000 PlayPumps. Assuming one PlayPump per village, that means a village of 2500 would require 27 hours a day of PlayPump to meet their water needs. The only editorializing MacAskill does is call a village of 2500 “typical”. No misrepresentation. 5. MacAskill says that PlayPumps often replaced old pumps. You correctly point out that in most countries, that did not happen. Bottom line: You’re right that (i) MacAskill exaggerates the number of children who broke bones; it was one reported case, not multiple; and (ii) MacAskill incorrectly implies that PlayPumps often replaced old pumps, when in fact they rarely did.
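The arithmetic in point 4 can be checked with a short sketch that uses only the figures quoted above (10 million people, 4000 PlayPumps, 15 liters per person per day, 27 hours of pumping). The implied flow rate at the end is derived from those figures and does not come from any of the reports; the variable names are just illustrative.

```python
# Figures as quoted in the comment above.
target_people = 10_000_000
pumps = 4_000
liters_per_person_per_day = 15
pumping_hours_per_day = 27

# One PlayPump per village (the assumption stated in the comment).
people_per_pump = target_people / pumps

# Water a single pump would have to deliver each day.
daily_demand_liters = people_per_pump * liters_per_person_per_day

# Flow rate implied by needing 27 hours to deliver that much water.
implied_liters_per_hour = daily_demand_liters / pumping_hours_per_day

print(f"people per pump:       {people_per_pump:.0f}")
print(f"daily demand (liters): {daily_demand_liters:.0f}")
print(f"implied liters/hour:   {implied_liters_per_hour:.1f}")
```

So the village of 2500 is just 10 million divided by 4000, which is why the only editorial step is calling a village of that size “typical”.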
Again, thank you for continuing to engage in this in a fair and receptive way. But after spending a lot of time looking into this, I’m less convinced than I ever was of your argument. You have four good points: (i) MacAskill should’ve used other deworming evidence; (ii) MacAskill exaggerated the number of children who broke bones on PlayPumps; (iii) MacAskill incorrectly implies that PlayPumps often replaced old pumps, when in fact they rarely did; (iv) MacAskill incorrectly reported the question asked by a survey on ethical companies. You might have a good point with the John Bunker DALY estimates, but I haven’t looked into it enough.
Framed in the right way, these four points would be helpful, useful feedback for MacAskill. Four slips in 200 pages seems impressively good, but MacAskill surely would have promptly updated his Errata page, and that would be that. Nothing significant whatsoever about the book would’ve changed. But because they were framed as “William MacAskill is a liar”, nobody else has been willing to engage your points, lest they legitimize clearly unfair criticism. Yes, he didn’t make the best response to your points, but to be frank, they were quite unorganized and hard to follow—it’s taken me upwards of 5 hours in sum to get to the bottom of your claims.
At this point, I really don’t think you can justifiably continue to hold either of your positions: that DGB is significantly inaccurate, or that MacAskill is dishonest. I really do believe that you’re in this in good faith, and that your main error (save the ad hominem attack, likely a judgement error) was in not getting to the bottom of these questions. But now the questions feel very well resolved. Unless the four issues listed above constitute systemic inaccuracy, I really don’t see an argument for it.
Sincerely, thank you for engaging, and if you find these arguments correct, I hope you’ll uphold our value of honesty and apologize to MacAskill for the ad hominem attacks, as well as give him a kinder, more accurate explanation of his inaccuracies. I hope I’ve helped.
Thank you a ton for the time and effort you put into this. I find myself disagreeing with you, but this may reflect my investment in my arguments. I will write to you later, once I reflect on this further.
PlayPumps: I don’t agree with your assessment of points 1, 2, 4.
At this point, I really don’t think you can justifiably continue to hold either of your positions: that DGB is significantly inaccurate, or that MacAskill is dishonest. I really do believe that you’re in this in good faith, and that your main error (save the ad hominem attack, likely a judgement error) was in not getting to the bottom of these questions. But now the questions feel very well resolved. Unless the four issues listed above constitute systemic inaccuracy, I really don’t see an argument for it.
Sincerely, thank you for engaging, and if you find these arguments correct, I hope you’ll uphold our value of honesty and apologize to MacAskill for the ad hominem attacks, as well as give him a kinder, more accurate explanation of his inaccuracies. I hope I’ve helped.
I have already apologized to MacAskill for the first, even harsher, version of the post. I will certainly apologize to him, if I conclude that the arguments he made were not made in bad faith, but at this point I find that my central point stands.
As I wrote in another comment, thank you for your time and I will let you know later about my conclusions. I will likely rewrite the post after this.
There, I point out that MacAskill responds not to any of the published versions of the essay but to a confidential draft (since he says that I’m quoting him on something that I only quoted him about in a draft).
What do you think about it? Is my interpretation here plausible? What are the other plausible explanations for this? Maybe I fail to see charitable interpretations of how that happened.
I’m not sure how EA Forum displays drafts. It seems very plausible that, on this sometimes confusing platform, you’re mistaken as to which draft was available where and when. If you’re implying that the CEA employee sent MacAskill the draft, then yes, they should not have done that, but MacAskill played no part in that. Further, it seems basic courtesy to let someone respond to your arguments before you publicly call them a liar—you should’ve allowed MacAskill a chance to respond without immediate time pressure.
I’m sorry, this was my fault. You sent me a draft and asked me not to share it, and a few days later in rereading the email and deciding what to do with it, I wasn’t careful and failed to read the part where you asked me not to share it. I shared it with Will at that point, and I apologize for my carelessness.
Well, it happens. Although if you forwarded it to Will, then he probably read the part of the email where I ask not to share it with anybody, but proceeded to read that draft and respond to a confidential draft anyway.
I’ve defended MacAskill extensively here, but why are people downvoting to hide this legitimate criticism? MacAskill acknowledged that he did this and apologized.
If there’s a reason please say so, I might be missing something. But downvoting a comment until it disappears without explaining why seems harsh. Thanks!
I didn’t downvote the comment, but it did seem a little harsh to me. I can easily imagine being forwarded a draft article, and reading the text the person forwarding wrote, then looking at the draft, without reading the text in the email they were originally sent. (Hence missing text saying the draft was supposed to be confidential.) Assuming that Will read the part saying it was confidential seemed uncharitable to me (though it turns out to be correct). That seemed in surprising contrast to the understanding attitude taken to Julia’s mistake.
I should note that now we know that William did in fact know that the draft was confidential. Quoting a comment of his above:
In hindsight, once I’d seen that you didn’t want the post shared I should have simply ignored it, and ensured you knew that it had been accidentally shared with me.
I second Julia in her apology. In hindsight, once I’d seen that you didn’t want the post shared I should have simply ignored it, and ensured you knew that it had been accidentally shared with me.
When it was shared with me, the damage had already been done, so I thought it made sense to start prepping a response. I didn’t think your post would change significantly, and at the time I thought it would be good for me to start going through your critique to see if there were indeed grave mistakes in DGB, and offer a speedy response for a more fruitful discussion. I’m sorry that I therefore misrepresented you. As you know, the draft you sent to Julia was quite a bit more hostile than the published version; I can only say that as a result of this I felt under attack, and that clouded my judgment.
As you know, the draft you sent to Julia was quite a bit more hostile than the published version
And the first draft that I sent to my friends was much more hostile than that. Every draft gets toned down and corrected a lot. This is precisely why I ask everybody not to share them.
[comment I’m likely to regret writing; still seems right]
It seems a lot of people are reacting by voting, but the karma of the post is 0. It seems to me up-votes and down-votes are really not expressive enough, so I want to add a more complex reaction.
It is really very unfortunate that the post is framed around the question whether Will MacAskill is or is not honest. This is wrong, and makes any subsequent discussion difficult. (strong down-vote) (Also the conclusion (“he is not”) is not really supported by the evidence.)
It is (and was even more in the blog version) over-zealous, interpreting things uncharitably, and suggesting extreme actions. (downvote)
At the same time, it seems really important to have an open and critical discussion, and a culture where people can challenge ‘canonical’ EA books and movement leaders. (upvote)
Carefully going through the sources and checking if papers are not cherry-picked and represented truthfully is commendable. (upvote)
Having really good epistemics is really important, in particular with the focus on the long term. Vigilance in this direction seems good. (upvote)
So it seems really a pity the post was not framed as a question somewhere in the direction “do you think this is epistemically good?”
If I try to imagine something like a “steel-manned version of the post”, without questioning honesty, and without making uncharitable inferences, the reaction could have been some useful discussion.
It seems to me
“Doing Good Better” is sometimes more on the “explaining & advocacy of ideas” side than “dispassionate representation of research”.
Given the genre, I would bet the book is in the top quartile on the metric of representing research correctly.
In some of the examples, it seems adding more caveats and reporting in more detail would have been better for readers interested in precision. Likely at the cost of making the text more dry.
Some emotions sometimes creep in: In the case of the somewhat uncharitable part about Charity Navigator, I remembered their much more uncharitable / misrepresenting text attacking effective altruism and GiveWell. Also, while they talk about the importance of other things, what they actually measure is wrong, and is criticized correctly. In the case of the whole topic… well, a lot of evidence points toward things like the 100x multiplier being true, meaning that yes, it actually is possible to save many more people. It seems hard not to have some passion.
Given that several books about the long-term future have now been written, the update I would take from that is that books mostly about the long term should err more on the side of caveating, describing disagreements, and explaining uncertainty, but my feeling is that a shift in this direction already happened between 2014 and 2018.
I agree with all the points you make here, including on the suggested upvote/downvote distribution, and on the nature of DGB. FWIW, my (current, defeasible) plan for any future trade books I write is that they’d be more highbrow (and more caveated, and therefore drier) than DGB.
I think that’s the right approach for me, at the moment. But presumably at some point the best thing to do (for some people) will be wider advocacy (wider than DGB), which will inevitably involve simplification of ideas. So we’ll have to figure out what epistemic standards are appropriate in that context (given that GiveWell-level detail is off the table).
Some preliminary thoughts on heuristics for this (these are suggestions only):
Standards we’d want to keep as high as ever:
Is the broad brush strokes picture of what is being conveyed accurate? Is there any easy way the broad brush of what is conveyed could have been made more accurate?
Are the sentences being used to support this broad brush strokes picture warranted by the evidence?
Is this way of communicating the core message about as caveated and detailed as one can reasonably manage?
Standards we’d need to relax:
Does this communicate as much detail as possible with respect to the relevant claims?
Does this communicate all the strongest possible counterarguments to the key claim?
Does this include every reasonable caveat?
I think that a blogpost that does very well with respect to the above, without compromising on the clarity of the core message, is Max Roser’s recent post: ‘The world is much better; The world is awful; The world can be much better’.
Thanks. I think the criteria you propose for which standards to keep and which to relax are reasonable.
It seems an important question. I would like someone to try studying it more formally, using for example “value of information” or “rational inattention” frameworks. I can imagine experiments like giving people a longer list of arguments, trying to gather feedback on what the value was for them, and then making decisions based on that. (Now this seems to be done mainly based on authors’ intuitions.)
I agree Max’s post is doing a really good job!
Hi Jan,
Thanks for the feedback.
You write:
I should point out that in the post I show not just a lack of caveats and details. William misrepresents the evidence. Among other things, he:
cherry picks the variables from a deworming paper he cites
interprets GW’s AMF estimate in a way they specifically asked people not to interpret it (the “five hundred times” more effective claim; Holden wrote specifically that such arguments seem to require taking cost-effectiveness estimates literally)
quotes two sentences from Charity Navigator’s site when the very next sentence shows that the interpretation of the previous sentences is wrong
In a long response William posted here, he did not address any of these points:
he doesn’t mention cherry picking (and neither does his errata page)
he doesn’t mention the fact that GiveWell asked not to interpret their AMF estimate literally
and he writes “I represent CN fairly, and make a fair criticism of its approach to assessing charities.”, which may be true about CN’s general position, but which has nothing to do with misquoting Charity Navigator.
If the issue was just a lack of detail, of course I would not have written the post in such a tone. Initially, I considered simply emailing him a list of mistakes that I found, but as I mentioned in the post, the volume and egregiousness of misrepresentations led me to conclude that he argued in bad faith.
edit: I will email GiveWell to clarify what they think about William making claims about 500 times more benefit on the basis of their AMF estimate.
I think I understand how you gradually became upset, but it seems that in the process you started to miss the more favorable interpretations.
For example, with the “interpretation of the GiveWell estimates”: based on reading a bunch of old discussions on archive, my _impression_ is that there was, at least at some point in time, a genuine disagreement about how to interpret the numbers between Will, Tobi, Holden and possibly others (there was much less disagreement about the numeric values). So if this is the case, it is plausible Will was using his interpretation of the numbers, which was in some sense “bolder” than the GW interpretation. My sense of a good epistemic standard is that you certainly can do this, but should add a caveat warning that the authors of the numbers have a different interpretation of them (so it is a missing caveat). At the same time I can imagine how you can fail to do this without any bad faith: for example, if you are at some point in the discussion confused about whether some object-level disagreement continues or not (especially if you ask the other party in the disagreement to check the text). Also, if my impression is correct and the core of the object-level disagreement was a quite technical question regarding the proper use of Bayesian statistics and EV calculations, it does not seem obvious how to report the disagreement to the general public.
In general: switching to the assumption that someone is deliberately misleading is a highly slippery slope: it seems that with this sort of assumption you can kind of explain everything, often easily, and if you can’t, e.g., speak to people in person, it may be quite difficult to find anything which would make you update in the opposite direction.
About cost-effectiveness estimates: I don’t think your interpretation is plausible. The GiveWell page that gives the $3400 estimate specifically asks not to interpret it literally.
About me deciding that MacAskill is deliberately misleading. Please see my comment in /r/slatestarcodex in response to /u/scottalexander about it. Would love to know what you think.
[because of time constraints, I will focus on just one example now]
Yes, but GiveWell is not some sort of ultimate authority on how their numbers should be interpreted. Take an ad absurdum example: the NRA publishes some numbers about guns and gun-related violence, along with their interpretation that there are not enough guns in the US and gun violence is low. If you basically agree with the numbers, but disagree with their interpretation, surely you can use the numbers and interpret them in a different way.
GiveWell reasoning is explained in this article. Technically speaking you _can_ use the numbers directly as EV estimates if you have a very broad prior, and the prior is all the same across all the actions you are comparing. (You can argue this is technically not the right thing to do, or you can argue that GiveWell advises people not to do it.) As I stated in my original comment, I’d appreciate if such disagreements are reported. At the same time it seems difficult to do it properly in a popular text. I can imagine something like this
being more precise, but you can probably see it is a very different book now. I’d be quite interested in how you would write the paragraph if you wanted to use the number, wanted to give a numerical estimate of the cost per life saved, and did not want to explain Bayesian estimates to the reader.
This seems like a good argument. Thank you. I will think about it.
Guzey, would you consider rewriting this post, framing it not as questioning MacAskill’s honesty but rather just pointing out some flaws in the representation of research? I fully buy some of your criticisms (it was an epistemic failure to not report that deworming has no effect on test scores, misrepresent Charity Navigator’s views, and misrepresent the “ethical employer” poll). And I think Jan’s views accurately reflect the community’s views: we want to be able to have open discussion and criticism, even of the EA “canon.” But it’s absolutely correct that the personal attacks on MacAskill’s integrity make it near impossible to have this open discussion.
Even if you’re still convinced that MacAskill is dishonest, wouldn’t the best way to prove it to the community be to have a thorough, open debate over these factual questions? Then, if it becomes clear that your criticisms are correct, people will be able to judge the honesty issue themselves. I think you’re limiting your own potential here by making people not want to engage with your ideas.
I’d be happy to engage with the individual criticisms here and have some back and forth, if only this was written in a less ad hominem way.
Separately, does anyone have thoughts on the John Bunker DALY estimate? MacAskill claims that a developed world doctor only creates 7 DALYs, Bunker’s paper doesn’t seem to say anything like this, and this 80,000 Hours blog estimates instead that a developed world doctor creates 600 QALYs. Was MacAskill wrong on the effectiveness of becoming a doctor?
Hi smithee,
I do wonder if I should’ve written this post in a less personal tone. I will consider writing a follow up to it.
About me deciding that MacAskill is deliberately misleading, please see my comment in /r/slatestarcodex in response to /u/scottalexander about it. Would love to know what you think.
I’ll headline this by saying that I completely believe you’re doing this in good faith, I agree with several of your criticisms, and I think this deserves to be openly discussed. But I also strongly disagree with your conclusion about MacAskill’s honesty, and, even if I thought it was plausible, it still would be an unnecessary breach of etiquette that makes open conversation near impossible. I really think you should stop making this an argument about MacAskill’s personal honesty. Have the facts debate, leave ad hominem aside so everyone can fully engage, and if you’re proven right on the facts, then raise your honesty concerns.
First I’d like to address your individual points, then your claims about MacAskill.
Misreporting the deworming study. I think this is your best point. It seems entirely correct that if textbooks fail because they don’t improve test scores, that deworming should fail by the same metric. But I agree with /u/ScottAlexander that, in popular writing, you often don’t have the space to specifically go through all the literature on why deworming is better. MacAskill’s deworming claims were misleading on one level, in that the specific argument he provided is not a good one, but also fair on another level: MacAskill/GiveWell has looked tons into deworming, concluded that it’s better than textbooks, and this is the easiest way to illustrate why in a single sentence. Nobody reading this is looking for a survey of the evidence base on deworming; they’re reading it as an introduction to thinking critically about interventions. Bottom line: MacAskill probably should’ve found a better example/line of defense that was literally true, but even this literally false claim serves its purpose in making a broader, true point.
Interpreting GiveWell literally. Jan’s comment was perfect: GiveWell is not the supreme authority on how to interpret their numbers. Holden prefers to give extra weight to expected values with low uncertainty, MacAskill doesn’t, and that’s a legitimate disagreement. In any case, if you think people shouldn’t ever interpret GiveWell’s estimates literally when pitching EA, that’s not a problem with MacAskill, it’s a problem with >90% of the EA community. Bottom line: I think you should drop this argument, I just don’t think it’s correct.
Misrepresenting Charity Navigator. As MacAskill admits, it’s inaccurate to conflate overhead costs and CEO pay. Good find, the specific criticism was correct. But after thinking it through, I think MacAskill’s argument, while botching that single detail, is still a fair criticism of an accurate overall characterization of Charity Navigator. Let’s focus on the donut example. MacAskill says that if a donut charity had a low-paid CEO, CN would rate them highly. You correctly identify that CN cares about things other than CEO pay, and is willing to give good ratings to charities with highly paid CEOs if they do well on other metrics, namely financial stability, accountability, and transparency. BUT, MacAskill’s point I believe would be that none of those other CN metrics have to do with the effectiveness of the intervention or the cause area. CN will let financial stability and low employee costs outweigh a highly-paid CEO, but they won’t let a terrible cause bring down your rating. So if you had a highly efficient, financially well-managed donut charity, CN really would give them a good rating. Bottom line: MacAskill mistakenly conflates CEO pay with overhead costs. But that’s incredibly minor, and no reader is going to be annoyed by it. His fundamental point is correct: CN doesn’t care about cause area or intervention effectiveness, and that’s silly to the point of absurdity.
Further, even if you still think MacAskill unfairly represented CN’s position, I’m willing to cut him a bit of slack on it. Do check out their hit piece on effective altruism. It’s aggressive, demeaning, and rude. Yes, it would’ve been better if MacAskill took the perfect high road, but if the inaccuracy really is minor, I think we can excuse it.
Exaggerating PlayPump’s failures. At first, I bought what you said in your comment. Everyone can read what you have to say themselves, but basically, it seems like MacAskill may have exaggerated the reports he cites discussing the failures of the PlayPump. But after a quick Google, it seems like this is another example of a specific line of argumentation that really isn’t rigorous, but that tries to make a fair point in a single sentence. PlayPump was a disaster, everyone agrees, and MacAskill was absolutely not the first to say so. So although MacAskill could’ve better explained specifically why it was a failure, without exaggerating reports, his conclusion is completely fair. I absolutely agree with the importance of honesty, and that bad arguments for a good conclusion are not justified. But this is popular writing, and he really doesn’t have space to fully review all the ins and outs of PlayPumps. Bottom line: I wish MacAskill more accurately justified his view, but nobody who looks into this should feel misled about the overall point of the failure of PlayPumps.
Conclusion: I think you correctly identify several inaccuracies in DGB. But after looking into them myself, I think you really overestimated the importance of these inaccuracies. Except perhaps the deworming example, none of these inaccuracies, if corrected, would change anything important about the conclusions of the book.
Even if you think I’m underestimating the level of inaccuracy, it seems near impossible that this is a sign of malice. If you go into a Barnes and Noble and pick out the popular nonfiction sitting to the left and right of DGB, I think you’d find dozens of inaccuracies far more important than these. Popular writing needs to oversimplify complex debates. DGB does an admirable job of preserving truth while simplifying.
I’ll reiterate that I really do believe in your good faith. You found inaccuracies, and you began worrying about MacAskill’s honesty, which drove you to find more inaccuracies. I think if you step back and consider the charitable interpretation of these flaws, though, you’ll realize that there are good reasons why they’re minor, and that it’s highly unlikely that this is the result of malice.
But finally, regardless of your conclusions on MacAskill’s honesty, I’ll say again that it’s absolutely destructive to open discourse and everyone’s goals to headline your post calling MacAskill a liar. If you want the community to engage in this conversation, you have to stick to the substantive disagreements. If consensus concludes that MacAskill importantly and repeatedly fails, people will question his honesty on their own. But I think if the open debate is had, you’ll eventually come around to thinking that these inaccuracies are minor, inconsequential, and accidental.
Thank you for a thoughtful response.
1. Deworming. Seems fair.
2. GiveWell. This seems like a good argument. I will think about it.
3. CN. If you read my post and not William’s response to it, I never accuse him of conflating CEO pay and overhead. He deflects my argument by writing about this. This is indeed a minor point.
I specifically accuse him of misquoting CN. As I wrote in other comments here, yes this might indeed be CN’s position and in the end, they would judge the doughnuts charity highly. I do not contend this point and never did. I only wrote that MacAskill (1) quotes CN, (2) makes conclusions based on this quote about CN, (3) the very page that MacAskill takes the quote from says that their position does not lead to these conclusions. And maybe CN is being completely hypocritical! This is not the point. It is still dishonest to misquote them.
4. PlayPumps: I feel like you’re kind of missing the point and I’m wondering if it might be some sort of a fundamental disagreement about unstated assumptions? I think that making dishonest arguments that lead to the right conclusions is still dishonest. It seems that you (and many other EAs) feel that if the conclusion is correct, then the fact that the argument was dishonest is not so important (same as with CN). Here’s what you say:
And here’s what I wrote in that comment specifically about this argument:
And in your conclusion you write:
Yes! I mostly agree with this! But (1) these are not just inaccuracies. I point out misrepresentations. (2) I believe that making dishonest arguments that advance the right conclusions is dishonest.
Do I understand you correctly that you disagree with me on point (2)?
First, on honesty. As I said above, I completely agree with you on honesty: “bad arguments for a good conclusion are not justified.” This is one of my (and I’d say the EA community as a whole) strongest values. Arguments are not soldiers, their only value is in their own truth. SSC’s In Favor of Niceness, Community, and Civilization sums up my views very well. I’m glad we’re after the same goal.
That said, in popular writing, it’s impossible to reflect the true complexity of what’s being described. So the goal is to simplify as much as possible, while losing as little truth as possible. If someone simplifies in a way that’s importantly misleading, that’s an important failure and should be condemned. But the more I dig into each of these arguments, the more I’m convinced MacAskill is doing a very good job maintaining truth while simplifying.
Charity Navigator. MacAskill says “One popular way of evaluating a charity is to look at financial information regarding how the charity spends its money.” He says that CN takes this approach, and then quotes CN saying that many of the best charities spend 25% or less on overhead. You say this is a misquote, because CN later says that high overhead can be OK if balanced by other indicators of financial health. CN says they like to see charities “that are able to grow their revenue at least at the rate of inflation, that continue to invest in their programs and that have some money saved for a rainy day.”
I see absolutely no misrepresentation here. MacAskill says CN evaluates based on financials such as overhead pay, and quotes CN saying that. He never says that CN only looks at overhead pay, neglecting other financials. In fact, his quote of CN says that overhead indicator is a “strong indicator” in “most” charities, which nobody would interpret as claiming that CN literally only evaluates overhead. The fact that CN does in fact care about financials other than overhead is abundantly clear when reading MacAskill’s summary. MacAskill perfectly represents their view. I doubt someone from CN would ever take issue with that first paragraph.
PlayPumps. Charge by charge:
1. After checking out both the UN and SKAT reports, I agree with MacAskill: they’re “damning”.
2. MacAskill says “But in order to pump water, PlayPumps need constant force, and children playing on them would quickly get exhausted.” You quote UNICEF saying “Some primary school children complained of becoming tired very quickly after pushing the pump, particularly as additional torque is required with each rotation to commence the upstroke of the piston.” Look at a video of one in motion: it’s clear that it spins easily for a little while but also constantly requires new force. No misrepresentation.
3. “Children sometimes fell off and broke limbs” is an exaggeration. One child fractured their arm, not multiple. MacAskill misrepresented the number of injuries.
4. The reporter said that PlayPump requires 27 hours of pumping a day in order to meet its ambition of supplying 15 liters a day to 10 million people using 4000 PlayPumps. Assuming one PlayPump per village, that means a village of 2500 would require 27 hours a day of PlayPump to meet their water needs. The only editorializing MacAskill does is call a village of 2500 “typical”. No misrepresentation.
5. MacAskill says that PlayPumps often replaced old pumps. You correctly point out that in most countries, that did not happen.
Bottom line: You’re right that (i) MacAskill exaggerates the number of children who broke bones; it was one reported case, not multiple; and (ii) MacAskill incorrectly implies that PlayPumps often replaced old pumps, when in fact they rarely did.
Again, thank you for continuing to engage in this in a fair and receptive way. But after spending a lot of time looking into this, I’m less convinced than I ever was of your argument. You have four good points: (i) MacAskill should’ve used other deworming evidence; (ii) MacAskill exaggerated the number of children who broke bones on PlayPumps; (iii) MacAskill incorrectly implies that PlayPumps often replaced old pumps, when in fact they rarely did; (iv) MacAskill incorrectly reported the question asked by a survey on ethical companies. You might have a good point with the John Bunker DALY estimates, but I haven’t looked into it enough.
Framed in the right way, these four points would be helpful, useful feedback for MacAskill. Four slips in 200 pages seems impressively good, but MacAskill surely would have promptly updated his Errata page, and that would be that. Nothing significant whatsoever about the book would’ve changed. But because they were framed as “William MacAskill is a liar”, nobody else has been willing to engage your points, lest they legitimize clearly unfair criticism. Yes, he didn’t make the best response to your points, but to be frank, they were quite unorganized and hard to follow—it’s taken me upwards of 5 hours in sum to get to the bottom of your claims.
At this point, I really don’t think you can justifiably continue to hold either of your positions: that DGB is significantly inaccurate, or that MacAskill is dishonest. I really do believe that you’re in this in good faith, and that your main error (save the ad hominem attack, likely a judgement error) was in not getting to the bottom of these questions. But now the questions feel very well resolved. Unless the four issues listed above constitute systemic inaccuracy, I really don’t see an argument for it.
Sincerely, thank you for engaging, and if you find these arguments correct, I hope you’ll uphold our value of honesty and apologize to MacAskill for the ad hominem attacks, as well as give him a kinder, more accurate explanation of his inaccuracies. I hope I’ve helped.
Thank you a ton for the time and effort you put into this. I find myself disagreeing with you, but this may reflect my investment in my arguments. I will write to you later, once I reflect on this further.
CN: I don’t agree with you
PlayPumps: I don’t agree with your assessment of points 1, 2, 4.
I have already apologized to MacAskill for the first, even harsher, version of the post. I will certainly apologize to him, if I conclude that the arguments he made were not made in bad faith, but at this point I find that my central point stands.
As I wrote in another comment, thank you for your time and I will let you know later about my conclusions. I will likely rewrite the post after this.
Also, I wonder what you think about the second half of this comment of mine in this thread.
There, I point out that MacAskill responds not to any of the published versions of the essay but to a confidential draft (since he says that I’m quoting him on something that I only quoted him about in a draft).
What do you think about it? Is my interpretation here plausible? What are the other plausible explanations for this? Maybe I fail to see charitable interpretations of how that happened.
I’m not sure how EA Forum displays drafts. It seems very plausible that, on this sometimes confusing platform, you’re mistaken as to which draft was available where and when. If you’re implying that the CEA employee sent MacAskill the draft, then yes, they should not have done that, but MacAskill played no part in that. Further, it seems basic courtesy to let someone respond to your arguments before you publicly call them a liar—you should’ve allowed MacAskill a chance to respond without immediate time pressure.
I never posted the draft that had this quote on EA Forum. Further, I clearly asked everyone I sent the drafts to not to share them with anybody.
I’m sorry, this was my fault. You sent me a draft and asked me not to share it, and a few days later in rereading the email and deciding what to do with it, I wasn’t careful and failed to read the part where you asked me not to share it. I shared it with Will at that point, and I apologize for my carelessness.
Well, it happens. Although if you forwarded it to Will, then he probably read the part of the email where I ask not to share it with anybody, but proceeded to read that draft and respond to a confidential draft anyway.
I’ve defended MacAskill extensively here, but why are people downvoting to hide this legitimate criticism? MacAskill acknowledged that he did this and apologized.
If there’s a reason please say so, I might be missing something. But downvoting a comment until it disappears without explaining why seems harsh. Thanks!
I didn’t downvote the comment, but it did seem a little harsh to me. I can easily imagine being forwarded a draft article, and reading the text the person forwarding wrote, then looking at the draft, without reading the text in the email they were originally sent. (Hence missing text saying the draft was supposed to be confidential.) Assuming that Will read the part saying it was confidential seemed uncharitable to me (though it turns out to be correct). That seemed in surprising contrast to the understanding attitude taken to Julia’s mistake.
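I should note that now we know that William did in fact know that the draft was confidential. Quoting a comment of his above:
“In hindsight, once I’d seen that you didn’t want the post shared I should have simply ignored it, and ensured you knew that it had been accidentally shared with me.”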
That’s what I meant by ‘though it turns out to be correct’. Sorry for being unclear.
I second Julia in her apology. In hindsight, once I’d seen that you didn’t want the post shared I should have simply ignored it, and ensured you knew that it had been accidentally shared with me.
When it was shared with me, the damage had already been done, so I thought it made sense to start prepping a response. I didn’t think your post would change significantly, and at the time I thought it would be good for me to start going through your critique to see if there were indeed grave mistakes in DGB, and offer a speedy response for a more fruitful discussion. I’m sorry that I therefore misrepresented you. As you know, the draft you sent to Julia was quite a bit more hostile than the published version; I can only say that as a result of this I felt under attack, and that clouded my judgment.
And the first draft that I sent to my friends was much more hostile than that. Every draft gets toned down and corrected a lot. This is precisely why I ask everybody not to share them.
Just wanted to note that now we know that MacAskill knew that the draft was confidential.