I largely agree with these; the only one I would add to or expand on is 2. There is both the vision/disentanglement aspect of this, but also having sufficient evidence that a particular type of research is worth doing. Maybe I’m just bad at 2, but I suspect the reason 1 is never reported as a constraint is that it’s much easier to think of plausible, exciting research projects than to confidently prove or disprove a project’s theory of change. Researchers in lots of disciplines come up with seemingly good research questions (to them) all the time, but very little research is actually high impact. So I suspect that in practice most research organisations should be constrained by this more than they are.
I’m interested in what you think is the constraint most organisations would report, and whether you think this lines up with their actual constraints.
GeorgeBridgwater
Awesome, I’m sure that would be great experience for using similar collective action in other causes! I had similar thoughts when writing this: although we focus on animal advocacy, in principle all our approach reports could be used for other asks in other cause areas. I’d be interested in takes from those working in these domains.
Great idea. For the riskier actions, that could be a good approach if there are aligned unions. Do you think this is something you could have got the unions you helped on board with? Either for animal advocacy or for other potentially impactful causes?
Awesome, great post; fantastic to see variations of this intervention being considered. The main concerns we focused on with unguided vs guided delivery methods for self-help were recruitment costs and retention.
As you show with the relative engagement in recent trials, retention seems to play out in favour of unguided delivery given the lower costs. I think this is a fair read; from what I’ve seen comparing the net effect sizes of these interventions, the “no less than half” figure is about right too.
For recruitment, given the average rate of download for apps, we discounted organic growth as you have. The figure cited for cost per install is correct (you can see another source here), but that is for general installs. For the unguided self-help intervention, we want a subset of people with the condition we are targeting (~5-15%), and we then want to turn those installs into active, engaged users of the program. I don’t know exactly how much these would increase the costs compared to the $0.02 - $0.10 figure. Plausibly 6-20 times for targeting the subgroup with the condition (although even the general population may benefit a bit from the intervention). Then there is the cost of turning those installs into active users: some stats suggest a 23.01% retention rate on day one, falling to as low as 2.59% by day 30 (source; similar to your 3.3% real-world app data). So that is maybe a further 4 to 40 times increase in recruitment costs, depending. Overall, that could increase the cost of recruiting an active user by 24 to 800 times (or $0.48 to $8).
Although these costs would be identical across unguided vs guided, the main consideration then becomes: does it cost more to recruit more users, or to guide existing users and increase the follow-through/effect size? Adding the above factors quickly to your CEA for recruitment cost gives a cost per active user of $9.60 ($1.60 to $32), and a cost-effectiveness for guided of mean 24 (12 to 47) and for unguided of mean 21 (4.8 to 66). Taking all of this at face value, which version looks more cost-effective will depend a lot on exactly where and how the intervention is being employed. Even then, there is uncertainty with some of these figures in a real-world setting.
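A minimal sketch of how those multipliers compound, using the figures above (how the dollar endpoints come out depends on how the low and high ends of each range are paired):

```python
# Back-of-envelope recruitment-cost sketch. All inputs are (low, high)
# ranges taken from the figures in this comment.
cost_per_install = (0.02, 0.10)   # $ per general app install
targeting_mult = (6, 20)          # reaching the ~5-15% with the condition
activation_mult = (4, 40)         # turning installs into active users

# Compounding the two multipliers reproduces the 24x to 800x range:
overall_mult = (targeting_mult[0] * activation_mult[0],
                targeting_mult[1] * activation_mult[1])
print(overall_mult)

# Pairing the low ends gives the $0.48 lower bound quoted above:
low_cost = cost_per_install[0] * overall_mult[0]
print(round(low_cost, 2))
```

This is only the multiplier arithmetic, not the full CEA; the cost-effectiveness ranges above come from feeding these into the post's model.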
For the unguided app model you outline, I agree that, if successful, this would be incredibly cost-effective, although at present I’d still be uncertain which version would look best. Ultimately that’s up to groups implementing this family of interventions, like Kaya Guides, to explore through implementing and experimenting.
At Animal Ask we did later hear some of that feedback ourselves and one of our early projects failed for similar reasons. Our programs are very group-led, as in we select our research priorities based on groups looking to pursue new campaigns. This means the majority of our projects tend to focus on policy rather than corporate work, given more groups consider new country-specific campaigns and want research to inform this decision.
In the original report from CE, they do account for the consolidation of corporate work behind a few asks. They expected the research on corporate work to be ‘ongoing’, ‘deeper’, and ‘more focused’. So strategically it would look more like research running throughout the previous corporate campaign to inform the next, with a low probability of updating any specific ask. The expectation is that it could be many years between the formation of corporate asks.
So in fact this consolidation was highlighted in the incubation program as a reason success could have so much impact: with the large amount of resources the movement devotes to these consolidated corporate asks, ensuring they are optimised is essential.
As Ren outlined, we have a couple of recent, more detailed evaluations, and we have found that the main limitations on our impact are factors only a minority of advisors in the animal space highlighted. These are constraints from other organisational stakeholders: either upper management (when the campaigns team had updated on our findings but there was momentum behind another campaign) or funders (particularly individual or smaller donors, who are typically less research-motivated than OPP, EAAWF, ACE, etc.).
You can see this was the main concern for CE researchers in the original report. “Organizations in the animal space are increasingly aware of the importance of research, but often there are many factors to consider, including logistical ease, momentum, and donor interest. It is possible that this research would not be the determining factor in many cases”.
From the main body of the text: “Plant-based products represented 0·011 % of product unit sales in the pre-intervention period. This increased to 0·016 % during the intervention period and 0·012 % in the post-intervention period. Meat products represented 26·52 % of sales in the pre-intervention period, 26·51 % during the intervention period and 26·32 % in the post-intervention period. The remainder of sales were represented by non-meat products (73·47 % in pre-intervention and intervention periods, 73·67 % in the post-intervention period).”
One thing to flag when reading this as evidence against plant-based sales leading to lower meat consumption is how much harder it is to detect a significant effect on meat consumption. The background variation in meat sales is much higher, so to detect a significant effect from a campaign like Veganuary we would need a much larger total effect size. Even if the 0.005 percentage-point change in plant-based sales during the intervention came 100% from meat, I’d expect it still would not be significant.
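To put rough numbers on that (sales shares taken from the quoted passage; this is purely back-of-envelope and ignores the actual variance structure):

```python
# Shares of unit sales from the quoted study, in %
plant_pre, plant_during = 0.011, 0.016
meat_pre = 26.52

# The same absolute shift is huge relative to the plant-based baseline
# but vanishingly small relative to the meat baseline.
shift = plant_during - plant_pre       # ~0.005 percentage points
rel_plant = shift / plant_pre          # ~45% relative change in plant-based
rel_meat = shift / meat_pre            # ~0.019% relative change in meat
print(round(rel_plant * 100), round(rel_meat * 100, 3))
```

A shift that grows the plant-based share by nearly half is a rounding error against the meat baseline, which is why a null result on meat sales here is weak evidence either way.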
That seems like the 80/20 of this and would be appropriate for a lot of candidates. I guess I assume that a lot of EA candidates have a higher bar for claims made in typical fundraising material, so they would benefit from delving deeper into the numbers. This depends on how much trust you already have in organisations: if you think groups are already assessed with enough rigor by funders, e.g. they have a GiveWell recommendation, then the time cost of going through the numbers makes less sense. I think this would work best for meta-groups like the organisation I work for, Animal Ask, or others like Animal Advocacy Careers, Charity Entrepreneurship, 80,000 Hours, Rethink Priorities, the Global Priorities Institute, etc.
Hey Sofia, great idea. Groups have usually indicated they would spend <10% of the time we spend researching without our involvement, so this seems like a more viable idea than one might expect. There are some reasons this may not entirely cross-apply to the rest of our work, such as concerns with groups anchoring too much to their shallower research, which usually results in more optimistic assessments (Optimizer’s Curse), or possibly a selection effect, with the groups willing to do this being more likely to make better decisions. We are tracking the asks other similar organisations are using in the regions or areas we have worked in too. This gives us some sense of this, but a more direct experiment of this kind could be valuable, particularly if we ran it with a few groups using different advocacy methods. We will look into the idea more, as well as some of the other ways we could amend our pre/post surveys, before we partner with the next group!
Hello Joel,
I agree that in hindsight a summary of each indicator would probably have been useful to provide the reader with an overall assessment given the information I reviewed in the report.
wellbeing measured = accuracy × importance = (reliability × cardinality) × (validity × wellbeing account)
That model is roughly the way I was thinking of this assessment, with validity and interpersonal comparisons capturing how much I would update on a perfectly accurate measure, and reliability giving some sense of how wide the confidence interval would be from a real-world measurement. The trade-off between these, across groups of indicators and individual indicators, adds some nuance: a single physiological measure is reliable but can vary due to numerous other factors, while a combination of them allows us to measure welfare benefits that health measures can’t capture easily. For example, it may be better to minimise disease rates instead of blood glucose levels if given no context, but disease rates would be unable to assess the importance of different types of environmental enrichment.
If more people comment to express interest in an overview of each section, I am happy to invest the time to go back through the report to add in these sections.
I think the ideal system would have a single measure that perfectly tracks what matters, no?
I definitely agree, which is partially why I put an example of self-reports in humans (which are in my opinion as close to ideal as we can get) alongside the measures we have available in other animals. This is what I currently view as the best available (‘ideal’) system given the weaker methods available.
My last question is: what are y’all’s thoughts on making across species comparisons? This is the question that really interests me, and most of these indicators presented seem to be much, much more suitable to within species assessments of welfare.
In this context, many of these indicators struggle with cross-species comparisons. Take cortisol, for example: different species have different baseline cortisol levels, making it difficult to compare levels or even percentage changes across species. We can gain some sense of the relative importance of different improvements or events for an individual from the degree of change in an indicator. An example of this within an operant test could be a human showing a mild preference for social contact over food, compared to a fox who shows the opposite relationship. Yet this still only gives us information within the range of their utility functions and doesn’t tell us how their ranges compare. It’s a challenging question, and because of this we have mostly been deferring to Rethink Priorities’s work on moral weights and Open Philanthropy Project’s report on consciousness. At the moment, we approach this by using an assessment of an animal’s quality of life to gauge how important an improvement is within an individual’s utility function, and then adjusting this based on these considerations. However, I would be cautious about concluding that an ask is more promising if the deciding factors rest on across-species comparisons, given the range of plausible views on the topic.
Thanks for the feedback
In these sorts of discussions, I don’t think comparing ourselves to the rest of the population is a great guide. It should probably be our base rate but many other factors can affect how income impacts our happiness.
If we look at the overall population, the income level required to get the maximum benefit from consumption is pretty high. However, there is some evidence that people who adopt voluntary simplicity can achieve greater life satisfaction on less income. Boujbel’s (2012) explanation for this is ‘that the control of one’s consumption desires is a significant mediator of the relationship between voluntary simplicity and life satisfaction among consumers who have limited financial resources’.
So then the question is can you reduce your consumption desires if you start life with high consumption? I’ve seen a few people achieve this and I think I’ve reduced consumption desires over time as well. But this is pretty weak evidence to make broader inferences so I don’t put too much weight on it.
I’d be much more interested in studying how income (specifically consumed rather than donated income) affects life satisfaction and value drift amongst EAs. I’d weight this much more than the general population for my own decision-making. If I had to bet, I would expect similar findings to Boujbel’s.
Excellent point. I was considering this when writing the report, as it would be possible to use remote volunteers. This would make it a great way for EA university groups to volunteer their time and encourage additional engagement from members. Beyond being a pure commitment device, volunteers will need to be able to answer some basic questions about the form of psychotherapy they are delivering. However, the skill requirement for providing support is still very low, and training would be short. One of the groups in the incubation program is looking into this more, and I think it could be a really great model for giving more people an effective way to donate their time.
Something that could explain the public backlash is the large percentage of people who are so-called ‘non-traders’ or ‘zero traders’ when asked to do time trade-offs when weighting QALYs. About 57% of respondents don’t trade off any length of life for quality increases. As you note, the public’s revealed preferences show they will trade off quality for quantity, but when asked to actually think about this, a lot of people refuse to do so. This explains why a large proportion of the public would view an argument for improved quality of life vs reduced length of life poorly. The finding is the same when looking at QALY vs $ trade-offs, with a large proportion of people unwilling to trade off any amount of money against the value of a life.
I would disagree with two steps in your reasoning. One is the relative importance of different animals, but Cameron_Meyer_Shorb’s comment already covers this point. Although your conclusion would probably not change if you valued animals more highly, making the combined effect of an American diet equal to one or maybe up to ten equivalent years of human life per year ($430 of enjoyment).
Instead, I think your argument breaks down when accounting for moral uncertainty: if you are not 100% certain in consequentialist ethics, then almost any other moral system would hold you much more accountable for pain you cause than for pain you fail to prevent. This holds particularly if we raise the required estimate for the $ value of the enjoyment gained, even if it is met. This makes it a different case from other altruistic trade-offs you might make, in that you are not trading a neutral action.
Another argument against this position is its effect on your moral attitudes, as Jeff Sebo argued in his talk at EA Global in 2019. You could dismiss this if you are certain it will not affect the relative value you place on other beings, and by not advertising your position so as not to affect others.
I’ve slowly been updating towards lower expected WP returns to improved DO based on conversations I have had with Fish Welfare Initiative. It seems likely that more fish are in the lower end of welfare benefit for DO optimisation because of the natural incentives that already exist for farmers in regards to DO. Low DO levels increase mortality, and fluctuations in air pressure can cause DO to plummet, so farmers often use extra buffer. Therefore any fish suffering −40 WP from DO levels alone would probably die; I think log-normal best captures this. Thanks for pointing this out, as I did not make it explicit in the report.
I think the third option is best to try to test. Apps like SmartMood could track the effect on your mood. I suppose the problem with this, though, is that something like eating a marginal apple will probably have very small effects (if any), so practically you won’t actually be able to measure it with this method. Things like meditation and a 10-minute walk I would guess would be measurable, though.
I think the reason summing the counterfactual impact of multiple people leads to weird results is not a problem with counterfactual impact but with how you are summing it. Adding each individual’s counterfactual impact is summing the difference between world A, where they both act, and worlds B and C, where each of them acts alone. In your calculus, you then assume this is the same as the difference between world A and world D, where nobody acts.
The true issue in maximising counterfactual impact seems to arise when actors act cooperatively but think of their actions individually. When acting cooperatively, you should compare your counterfactual to world D; when acting individually, to world B or C.
The Shapley value is not immune to error either; I can see three ways it could lead to poor decision-making:
For the Vaccine Reminder example, it seems stranger to me to attribute impact to people who would otherwise have no impact. We then get the same double-counting problem, or in this case infinite dividing, which is worse, as it can dissuade you from high-impact options. If I am not mistaken, in this case the Shapley value is divided between the NGO, the government, the doctor, the nurse, the people driving logistics, the person who built the roads, the person who trained the doctor, the person who made the phones, the person who set up the phone network, and the person who invented electricity. In that case, everyone is attributed a tiny fraction of the impact when only the vaccine reminder intentionally caused it. Depending on the scope of other actors we consider, this could massively reduce the apparent impact of the action.
Example 6 reveals another flaw, as attributing impact this way can lead you to make poor decisions. If you use the Shapley value when examining whether to leak information as the 10th person, you see that the action costs −1 million utils. If I was offered 500,000 utils to share, then under Shapley I should not do so, as 500,000 − 1M is negative. However, this thinking will just prevent me from increasing overall utils by 500,000.
In example 7, the counterfactual impact of the applicant who gets the job is not 0 but the impact of the job the lowest-impact person gets. Imagine each applicant could earn to give 2 utility and only has time for one job application. When considering counterfactual impact, the first applicant chooses to apply to the EA org and gets attributed 100 utility (as does the EA org). The other applicants now enter the space and decide to earn to give, as this has a higher counterfactual impact. They decrease the first applicant’s counterfactual utility to 2 but increase overall utility. If we use Shapley instead, then all applicants would apply for the EA org, as this gives them a value of 2.38 instead of 2.
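The double-counting contrast from the worlds A/B/C/D discussion above can be made concrete with a toy characteristic function (the setup and numbers here are illustrative, not from the original post): two actors are each individually necessary for one unit of impact, so each has a naive counterfactual impact of 1, while Shapley splits the credit.

```python
from itertools import permutations
from math import factorial

def shapley(players, v):
    """Exact Shapley values: average each player's marginal
    contribution to v over all orderings of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v(with_p) - v(coalition)
            coalition = with_p
    return {p: t / factorial(len(players)) for p, t in totals.items()}

# Both actors are needed for the outcome to happen at all.
v = lambda s: 1.0 if s == frozenset({"A", "B"}) else 0.0

# Naive counterfactual impact: each actor compares the full world
# to the world where only they drop out, so each claims the whole
# unit and the claims sum to 2.0 (double counting).
full = frozenset({"A", "B"})
counterfactuals = {p: v(full) - v(full - {p}) for p in ("A", "B")}
print(counterfactuals)

# Shapley instead splits the single unit of impact 0.5 / 0.5.
print(shapley(("A", "B"), v))
```

This only isolates where the naive sum goes wrong; it doesn’t rescue Shapley from the leaking and hiring examples above.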
I may have misunderstood Shapley here, so feel free to correct me. Overall I enjoyed the post and think it is well worth reading. Criticism of the underlying assumptions of many EAs’ decision-making methods is very valuable.
The openness of the EA movement to omnivores is a good point I had not considered before, although this could probably be accomplished by not being in people’s faces about it. I understand the reasoning that concludes that the strength of the obligations to give to charity and to be vegan are the same. However, I think there is one important distinction: we are causing harm. If we use the classic example of the child drowning in the pool, not giving to charity is analogous to allowing the child to drown, while eating meat is analogous to drowning the child (or at least a chicken every couple of days). I think we should examine our actions through many ethical theories due to moral uncertainty. If we do so, we can see that there tends to be an extra obligation not to do harm in many ethical theories. This means there is a distinction between not saving someone from drowning and drowning them. Thus I think there should be extra moral importance placed on first not doing the world any harm.
I think another potential cause, which I have at least observed in myself, is risk aversion. EA organisations are widely thought of as good career paths, which makes them easier to justify to others but also to yourself. If I pursue more niche roles, I am less certain that they will be high impact because I am relying only on my own judgment. This does justify some preference for EA organisations, but I agree there is probably an over-emphasis on them in the community.
Some pretty unintuitive results for some of these. I would not have assumed that a dairy cow would have a worse welfare score estimate than a beef cow. The method seems pretty logical, so I think it is more accurate than just my intuition. I guess my concern would still be with inter-species comparisons of utility, given their possibly varying levels of sentience. How is CE approaching this problem? With the usual neuron count, or is there a better way of doing it? I suppose that would just have to be something you concede a large margin of error for when comparing between species.
Thought this blog and the surrounding community would be a useful resource for EAs. I have already shared it with a few people.
I definitely agree with Raemon that having your own resources allows you greater flexibility, but I would go one step further and aim to amass enough money that I do not need paid work. This allows complete flexibility with your time over your remaining lifespan: you can work on any project that seems valuable, or turn down a salary for jobs at EA orgs. I am aware that EA orgs are time-sensitive in terms of donations; I think the estimated preference for immediate donations instead of donations one year on was somewhere between 10-12%. Depending on the length of time needed to attain ‘early retirement’ (ER), this could mean there is a higher expected value to donations over savings for flexibility. I think overall taking the possibility of ER into consideration is important, as it frames any decisions you make in the present. You can spend money now to free up time, but if you save that money it will help to free up all your time eventually.
Re-the impact of groups on corporations. I wrote a piece on this here.
I do think some companies are acting based on being more aligned such as high-end brands like Waitrose in the UK. Even in these cases, it can be a kind of getting things over the line scenario, where talking to them is the small nudge that results in counterfactual changes.
But as FAI mentions, the cost of “bad cop” actions to companies seems significant. If you’re looking for RCT-level evidence of this, unfortunately we don’t have it. The evidence mostly comes from case studies: broadly, how companies value their reputations and how comparable corporate scandals affect market valuation and performance. I’d be interested to see this replicated specifically for cage-free, taking historic or upcoming campaigns, working with groups for intel, and tracking their effect on companies.
I think attribution is broadly a fair concern though and could affect many interventions outside of anything you are directly paying for e.g. any lobbying-based interventions would have the same concern regardless of cause area.