The broader concern I share is the risk of data moving from experts to semi-experts to non-experts, with a loss of understanding at each stage. This is basically a ubiquitous problem, and EA is no exception. From looking into this back in 2013 I understand well where these numbers come from, the parts of the analysis that make me most nervous, and what they can and can’t show. But I think it’s fair to say that there has been a risk of derivative works being produced by people dabbling in the topic on a tough schedule, who then i) lose the full citation, or ii) accidentally present the numbers in a misleading way.
A classic case of this playing out at the moment is the confusion around GiveWell’s estimated ‘cost per life saved’ for AMF vs the new ‘cost per life saved equivalent’. GiveWell has tried, but research communication is hard. I feel sorry for people who engage in EA advocacy part time, as it’s very easy for them to get a detail wrong or have their facts out of date (snap quiz: in light of the latest research, how probable is it that deworming impacts i) weight, ii) school attendance, iii) incomes later in life?). This stuff should be corrected, but with love, as folks are usually doing their best, and not everyone can be expected to fully understand or keep up with research in effective altruism.
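To make that concrete, here’s a toy illustration with entirely made-up numbers (this is not GiveWell’s actual model or its figures): the naive metric only credits deaths averted, while the ‘equivalent’ metric first converts non-mortality benefits into units of value, so the two numbers differ even for the same program and are easy to mix up.

```python
# Toy illustration with made-up numbers (not GiveWell's model or figures):
# why a naive "cost per life saved" and a "cost per life saved equivalent"
# can legitimately differ for the same program.

total_cost = 1_000_000        # hypothetical spending, in dollars
deaths_averted = 250          # hypothetical direct mortality benefit
other_benefit_units = 120     # hypothetical non-mortality benefits (e.g. income
                              # gains), already expressed in "units of value"
units_per_death_averted = 35  # hypothetical moral weight: units of value
                              # assigned to averting one death

# Naive metric: only mortality counts.
cost_per_life_saved = total_cost / deaths_averted

# "Equivalent" metric: every benefit is converted into units of value, then
# re-expressed as the number of averted deaths carrying the same total value.
total_value_units = deaths_averted * units_per_death_averted + other_benefit_units
life_saved_equivalents = total_value_units / units_per_death_averted
cost_per_life_saved_equivalent = total_cost / life_saved_equivalents

print(f"cost per life saved:            ${cost_per_life_saved:,.0f}")
print(f"cost per life saved equivalent: ${cost_per_life_saved_equivalent:,.0f}")
# Quoting one figure while citing the other is exactly the kind of part-time
# advocacy slip described above.
```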
One valuable thing about this debate is that it has reminded us that people working on communicating ideas need to speak with the experts who know the details, and to stress about getting things as accurate as they can be in practice. Ideally one individual should become the point-person who truly understands any complex data source (and be replaced when staff move on).
The nature of the correction, I think, is that I underestimated how much individual caution there was in coming up with the original numbers. I was suggesting some amount of individual motivated cognition in generating the stitched-together dataset in the first place, and that’s what I think I was wrong about.
I still think that:
(1) The stitching-together represents a big problem and not a minor one. This is because it’s basically impossible to “sanity check” charts like this without introducing some selection bias. Each step away from the original source compounds this problem. Hugging the source data as tightly as you can and keeping track of the methodology is really the only way to fight this. Otherwise, even if there is no individual intent to mislead, we end up passing information through a long series of biased filters, and thus mainly flattering our preconceptions.
I can see the appeal of introducing individual human gatekeepers into the picture, but that comes with a pretty bad bottlenecking problem, and substitutes the bias of a single individual for the bias of the system. Having experts is great, but the point of sharing a chart is to give other people access to the underlying information in a way that’s intuitive to interpret. Robin Hanson’s post on academic vs amateur methods puts the case for this pretty clearly:
A key tradeoff in our methods is between ease and directness on the one hand, and robustness and rigor on the other. [...] When you need to make an immediate decision fast, direct easy methods look great. But when many varied people want to share an analysis process over a longer time period, more robust rigorous methods start to look better. Easy direct easy methods tend to be more uncertain and context dependent, and so don’t aggregate as well. Distant others find it harder to understand your claims and reasoning, and to judge their reliability. So distant others tend more to redo such analysis themselves rather than building on your analysis. [...]
You might think their added freedom would result in amateurs contributing proportionally more to intellectual progress, but in fact they contribute less. Yes, amateurs can and do make more initial progress when new topics arise suddenly far from topics where established expert institutions have specialized. But then over time amateurs blow their lead by focusing less and relying on easier more direct methods. They rely more on informal conversation as analysis method, they prefer personal connections over open competitions in choosing people, and they rely more on a perceived consensus among a smaller group of fellow enthusiasts. As a result, their contributions just don’t appeal as widely or as long.
GiveWell is a great example of an organization that keeps track of sources so that people who are interested can figure out how they got their numbers.
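As a sketch of what that kind of source-tracking can look like when two series get stitched together (the file names, columns, citations, and splice point below are hypothetical placeholders, not the actual chart’s pipeline), the point is just that the citation and the join methodology travel with the data rather than living in someone’s head:

```python
# A minimal sketch of "hugging the source data" when stitching two series.
# File names, column names, citations, and the splice percentile are all
# hypothetical placeholders, not the actual chart's pipeline.

import pandas as pd

survey_data = pd.read_csv("survey_percentiles.csv")      # e.g. household-survey series
top_end_data = pd.read_csv("top_income_estimates.csv")   # e.g. top-income estimates

SPLICE_PERCENTILE = 80  # illustrative hand-over point between the two sources

survey_data["source"] = "Survey series (full citation goes here)"
top_end_data["source"] = "Top-income series (full citation goes here)"

combined = pd.concat(
    [
        survey_data[survey_data["percentile"] <= SPLICE_PERCENTILE],
        top_end_data[top_end_data["percentile"] > SPLICE_PERCENTILE],
    ],
    ignore_index=True,
).sort_values("percentile")

combined.to_csv("combined_income_distribution.csv", index=False)

# Keep the methodology next to the data, not in someone's head.
with open("combined_income_distribution.README.txt", "w") as f:
    f.write(
        f"Stitched at the {SPLICE_PERCENTILE}th percentile; every row's 'source' "
        "column names the publication it was taken from.\n"
    )
```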
(2) It’s weird and a little sketchy that there’s not a discontinuity around 80%. This could easily be attributable to Milanovic rather than CEA, but I still think it’s a problem that that wasn’t caught, or—if there turns out to be a good explanation—documented.
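This is also the kind of check that’s cheap to run and write down once a stitched file like the one sketched above exists, reusing the same hypothetical columns:

```python
# A cheap sanity check for point (2): compare the percentile-to-percentile
# income steps around the supposed join (~80%) with the steps elsewhere.
# Column names and the input file match the hypothetical sketch above.

import pandas as pd

combined = pd.read_csv("combined_income_distribution.csv")
combined = combined.sort_values("percentile").reset_index(drop=True)

# Proportional jump in income from each percentile to the next.
steps = combined["income"].pct_change()

near_join = steps[combined["percentile"].between(78, 82)]
elsewhere = steps[~combined["percentile"].between(78, 82)]

print("steps near the 80th percentile:\n", near_join.describe())
print("steps elsewhere:\n", elsewhere.describe())
# If the join is seamless, that's worth an explanation in the methodology note;
# if it isn't, that's worth flagging too.
```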
(3) It’s entirely appropriate for CEA’s CEO (the one who used this chart at the start of the controversy that you’re now responding to by adding helpful information) to be held to a much higher standard than some amateur or part-time EA advocate who got excited about the implications of the chart. For this reason, while I think you’re right that it’s hard to avoid amateurs introducing large errors and substantial bias by oversimplifying things, that doesn’t seem all that relevant to the case that started this.
Hi Ben, thanks for retracting the comment.
What are the answers to the snap quiz btw?