Thanks for doing this! Though it seems like you kinda buried the lede. Why isn’t this in the top level summary?
In expectation, THL is >100x better than AMF
In the median scenario, THL is about 2-4x more cost-effective than AMF
A 71% chance that THL is more cost-effective than AMF
Thanks for raising this. It’s a fair question but I think I disagree that the numbers you quote should be in the top level summary.
I’m wary of overemphasising precise numbers. We’re really uncertain about many parts of this question and we arrived at these numbers by making many strong assumptions, so these numbers don’t represent our all-things-considered view and it might be misleading to state them without a lot of context. In particular, the numbers you quote came from the Guesstimate model, which isn’t where the bulk of the work on this project was focused (though we could have acknowledged that more). To my mind, the upshot of this investigation is better described by this bullet in the summary than by the numbers you quote:
In this model, in most of the most plausible scenarios, THL appears better than AMF. The difference in cost-effectiveness is usually within 1 or 2 orders of magnitude. Under some sets of reasonable assumptions, AMF looks better than THL. Because we have so much uncertainty, one could reasonably believe that AMF is more cost-effective than THL or one could reasonably believe that THL is more cost-effective than AMF.
Thanks! I appreciate your wariness of overemphasizing precise numbers and I agree that it is important to hedge your estimates in this way.
However, none of the claims in the bullet you cite gives us any indication of the expected value of each intervention. For two interventions A and B, all of the following are consistent with the expected value of A being astronomically higher than the expected value of B:
B is better than A in most of the most plausible scenarios
On most models the difference in cost-effectiveness is small (within 1 or 2 orders of magnitude)
One could reasonably believe that A is better than B or that B is better than A
Extremely little information is communicated about the relative expected value of A and B by the above points, and what information is communicated misleadingly suggests that both interventions are quite close in expected value. Because EAs are concerned with the expected value of interventions, I think you ought to communicate more about the relative expected value of the interventions and frame your summary of the interventions in a way that is less likely to mislead people about the relative expected value of each intervention.
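To make the point concrete, here is a minimal sketch in Python. The scenarios, probabilities, and values are entirely made up for illustration; they are not taken from the Guesstimate model or from the post. It shows a case where all three statements about A and B hold, yet A's expected value is far higher than B's because of a single rare scenario.

```python
import math

# Hypothetical scenarios: (probability, value of A, value of B), arbitrary units.
# These numbers are invented for illustration only.
SCENARIOS = [
    (0.70, 1.0, 3.0),        # B is 3x better than A
    (0.29, 2.0, 1.0),        # A is 2x better than B
    (0.01, 100_000.0, 1.0),  # rare tail: A is 100,000x better than B
]


def expected_value(scenarios, index):
    """Probability-weighted mean of the value at `index` (1 = A, 2 = B)."""
    return sum(s[0] * s[index] for s in scenarios)


def probability(scenarios, predicate):
    """Total probability of the scenarios for which predicate(a, b) is true."""
    return sum(p for p, a, b in scenarios if predicate(a, b))


ev_a = expected_value(SCENARIOS, 1)
ev_b = expected_value(SCENARIOS, 2)

# Share of probability mass in which B beats A.
p_b_better = probability(SCENARIOS, lambda a, b: b > a)

# Share of probability mass in which A and B differ by at most
# two orders of magnitude (a factor of 100) in either direction.
p_within_2_oom = probability(
    SCENARIOS, lambda a, b: abs(math.log10(a / b)) <= 2
)

print(f"P(B better than A)       = {p_b_better:.2f}")
print(f"P(within 2 orders)       = {p_within_2_oom:.2f}")
print(f"Expected value of A      = {ev_a:.2f}")
print(f"Expected value of B      = {ev_b:.2f}")
print(f"Ratio of expected values = {ev_a / ev_b:.0f}")
```

With these invented inputs, B comes out ahead in 70% of the probability mass and the two are within two orders of magnitude 99% of the time, yet A's expected value is roughly 400 times B's, driven almost entirely by the 1% tail scenario.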
I think the ideally informative way to both communicate the relative expected value of the interventions and hedge on your model uncertainty in the summary is to (1) provide your expected value estimate, (2) explain that you have high model uncertainty and one could arrive at a different expected value estimate with different assumptions, and (3) invite participants to adjust the Guesstimate and generate their own predictions.
Thanks, this is a good criticism. I think I agree with the main thrust of your comment but in a bit of a roundabout way.
I agree that focusing on expected value is important and that ideally we should communicate how arguments and results affect expected values. I think it’s helpful to distinguish between (1) expected value estimates that our models output and (2) the overall expected value of an action/intervention, which is informed by our models and arguments etc. The Guesstimate model is so speculative that it doesn’t actually do that much work in my overall expected value, so I don’t want to overemphasise it. Perhaps we under-emphasised it though.
The non-probabilistic model is also speculative of course, but I think it offers stronger evidence about the relative cost-effectiveness than the output of the Guesstimate model. It doesn’t offer a precise number in the same way that the Guesstimate model does, but the Guesstimate model only does that by making arbitrary distributional assumptions, so I don’t think it adds much information. I think that the non-probabilistic model offers evidence of greater cost-effectiveness of THL relative to AMF (given hedonism, anti-speciesism) because THL tends to come out better and sometimes comes out much, much better. I also think this isn’t super strong evidence, but you’re right that our summary is overly agnostic in light of this.
In case it’s helpful, here’s a possible explanation for why we communicated the findings in this way. We actually came into this project expecting THL to be much more cost-effective, given a wide range of assumptions about the parameters of our model (and assuming hedonism, anti-speciesism) and we were surprised to see that AMF could plausibly be more cost-effective. So for me, this project gave an update slightly in favour of AMF in terms of expected cost-effectiveness (though I was probably previously overconfident in THL). For many priors, this project should update the other way and for even more priors, this project should leave you expecting THL to be more cost-effective. I expect we were a bit torn in communicating how we updated and what the project showed and didn’t have the time to think this through and write this down explicitly, given other projects competing for our time and energy. It’s been helpful to clarify a few things through this discussion though :)