Edit: The bug I mentioned below has since been fixed. The default values still do not seem to match the figures in RP’s report here, and I believe there is also an error in said report that underestimates the impact by a factor of ~2. See the extended discussion on this post for details.
I would advise being careful with RP’s Cross-cause effectiveness tool as it currently stands, especially with regard to the chicken campaign. There appears to be a very clear conversion error, which I’ve detailed in the edit to my comment here. I was also unable to replicate their default values from their source data, but I may be missing something.
I think comments like these are valuable when they are made after the relevant parties have all had enough time to respond, the discussion is largely settled, and readers are in a position to make up their minds about the nature, magnitude and importance of the problems reported, by having access to all the information that is likely to emerge from the exchange in question. Instead, your comment cautions people to be careful in using a tool based on some issues you found and reported less than two days ago, when the discussion appears to be ongoing and some of the people involved have not even expressed an opinion, perhaps because they haven’t yet seen the thread or had enough time to digest your criticisms. Maybe these criticisms are correct and we should indeed exercise the degree of caution you advise when using the tool, but it seems not unlikely that we’ll be in a better epistemic position to know this, say, a week or so from now, so why not just wait for all the potential evidence to become available?
In the linked thread, the website owners have confirmed that there is indeed an error in the website. If you try to make calculations using their site as it currently stands, you will be off by a factor of a thousand. They have stated that this will be fixed soon. When it is fixed I will edit the shortform.
Would you prefer that for the next couple of days, during the heavily publicised AW vs GHD debate week, in which this tool has been cited multiple times, people continue to use it as is despite it being bugged and giving massively wrong results? Why are you not more concerned about flawed calculations being spread than about me pointing out that flawed calculations are being spread?
In your original shortform, you listed three separate criticisms, but your reply now focuses on just one of those criticisms, in a way that makes it look as if my concerns would be invalidated if one granted the validity of that specific criticism. This is the sort of subtle goalpost moving that makes it difficult to have a productive discussion.
Why are you not more concerned about flawed calculations being spread than about me pointing out that flawed calculations are being spread?
Because there is an asymmetry in the costs of waiting. Waiting a week or so to better understand the alleged problems of a tool that will likely be used for years is a very minor cost, compared to the expected improvement in that understanding that will occur over that period.
(ETA: I didn’t downvote any of your comments, in accordance with my policy of never downvoting comments I reply to, even if I believe I would normally have downvoted them. I mention this only because your most recent comment was downvoted just as I posted this one.)
I list exactly 2 criticisms. One of them was proven correct; the other I also believe to be correct, but I am waiting on a response.
I agree that there is an asymmetry in the cost of waiting, but it runs the other way. If these errors are corrected a week from now, after the debate week has wrapped up, then everybody will have stopped paying attention to the debate, and it will become much harder to correct any BS arising from the faulty tool.
Do you truly not care that people are accidentally spreading misinformation here?
Do you truly not care that people are accidentally spreading misinformation here?
Why do you attribute to me a view I never stated and do not hold? If I say that one cost is greater than another, it doesn’t mean that I do not care about the lesser cost.
I’d probably agree with this if the tool were not relevant for Debate Week and/or RP hadn’t highlighted this tool in a recent post for Debate Week. So there’s a greater risk of any errors cascading into the broader discussion in a way that wouldn’t be practically fixable by a later notice that the tool was broken.