Slight correction: The Charity Entrepreneurship program will be based in London, UK this year.
When I was writing this, I was mostly comparing it to other highly time-consuming activism (e.g. many people get a degree hoping it will help them acquire an EA job). In terms of whether this is the optimal thing for EA organizations to look for, I do not really have a view. I was more hoping to level the understanding between people who already have a pretty good sense that this sort of information is what you need, and people who might think it is worth far less than, say, a degree from a prestigious university.
Ok, given that multiple people think this is off, I have changed it to 3 hours to account for variation in application time.
My sense is they already had a CV that required very minimal customization and spent almost all the time on the cover letter.
It came from asking ~4 successful employees who were hired.
The following is a rough breakdown of the percentage of people who were not asked to move on to the next round in the Charity Science hiring process. These numbers assume one counterfactual hour of preparation for each interview and no preparation time outside of the given time limit for test tasks.

~3* hours invested (50%) - Cover letter/resume
~5 hours invested (20%) - Interview 1
~10 hours invested (15%) - Test task 1
~12 hours invested (5%) - Interview 2
~17 hours invested (5%) - Test task 2
~337 hours invested (2.5%) - Paid 2-month work trial
Hired (2.5%)

So, 95% of applicants spend 17 hours or less, 85% spend 10 hours or less, and 70% spend 5 hours or less.
*changed from 1 hour to 3 hours based on comments
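The cumulative figures above can be checked mechanically. A minimal sketch, using only the funnel numbers from the breakdown (shares are of all applicants):

```python
# Hypothetical sketch: cumulative share of applicants who exit the process
# at or before each stage, using the funnel numbers from the comment above.
funnel = [
    (3, 0.50),     # cover letter/resume
    (5, 0.20),     # interview 1
    (10, 0.15),    # test task 1
    (12, 0.05),    # interview 2
    (17, 0.05),    # test task 2
    (337, 0.025),  # paid 2-month work trial
]

cumulative = 0.0
for hours, share in funnel:
    cumulative += share
    print(f"{cumulative:.1%} of applicants invested <= {hours} hours before exiting")
```

Running this reproduces the summary line: 70% exit by 5 hours, 85% by 10, and 95% by 17.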
The end goal of any evaluation criterion is to best predict "good done". I broadly agree that a single factor is unlikely to fully rule an intervention in or out (this includes limiting factor, which was one of four criteria in our system). If we knew of a criterion that powerful, there would be no need for complex evaluation.
Although limiting factor is not a pure hard limit, I do not think this changes its usefulness much. An intervention might be low-evidence, and in theory multiple RCTs could be run to improve this, but in practice, if there is, say, a limiting factor on funding (such that multiple RCTs could not be funded), the intervention might remain low-evidenced indefinitely, even if in theory the evidence base is not immovable. It seems fairly clear that, all things being equal, running an intervention will be easier than running an equivalent intervention that also requires you to build a field of talent or otherwise work around a limiting factor. In principle I think this could be put into a more numerical form (e.g. included in a CEA), but in practice this has not been done. Historically, maybe the closest is the different levels of funding gaps that GiveWell has estimated for its top charities, but that mostly considers a single possible limiting factor (funding). I would love to see more models of limiting factors and think it would be a natural next step in the current EA talent vs. funding conversations.
A different way to think about this question is: do we think problem scale or limiting factor is the better predictor of areas where the most good can be done? I pretty strongly disagree that problem scale is more important than the limiting factor that will hit an intervention. Theoretically, scale of the problem is a harder limit, but that does not really matter if in practice an intervention is never capped by it. We ended up looking at quite a number of charities to consider what was stopping them (including GiveWell and ACE recommendations), and none of them seemed to be capped by problem scale; they had all been stopped by other limiting factors far before that became an issue (for example, with AMF it was funding and logistical bottlenecks, not the number of people with malaria). I think this is even true for the specific case of wild animal suffering interventions. The absolute number of bugs does not matter much when considering ethical pest control so much as the density per hectare of field or the available funding for a humane insecticides charity. You could imagine a world where the bug populations of colder locations (such as Canada and Russia) were close to 0, and it would do very little to affect the estimated good done, broadly because there is a ton of work to do in warmer locations before one would expand to Canada, and many limiting factors would likely be hit before expanding that far. How soon these problems hit would be more predictive of impact than whether there were twice or half as many bugs in the world as there are now.
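The argument can be put in toy-model form. In this purely illustrative sketch (all numbers made up), delivered impact is capped by whichever constraint binds first, so doubling problem scale changes nothing when scale is nowhere near the binding constraint:

```python
# Illustrative toy model (made-up numbers): units of good delivered per year
# are capped by whichever constraint binds first, not by total problem scale.
def deliverable(problem_scale, funding_cap, logistics_cap, talent_cap):
    return min(problem_scale, funding_cap, logistics_cap, talent_cap)

base = deliverable(problem_scale=1_000_000, funding_cap=50_000,
                   logistics_cap=80_000, talent_cap=120_000)
doubled_scale = deliverable(problem_scale=2_000_000, funding_cap=50_000,
                            logistics_cap=80_000, talent_cap=120_000)
more_funding = deliverable(problem_scale=1_000_000, funding_cap=100_000,
                           logistics_cap=80_000, talent_cap=120_000)

# Doubling the problem scale leaves output unchanged (still 50,000);
# relaxing the binding constraint (funding) raises it to 80,000.
print(base, doubled_scale, more_funding)
```

The point of the sketch is only the `min()` structure: the binding limiting factor, not the outermost cap, drives the result.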
I think historical evidence like "if this had not been done, X would never have happened" is not a very strong argument unless the research is done systematically and compares both the hits and the misses that occurred (e.g. there were a lot of causes that people attempted to build fields around at the same point in time but that never got traction). To take a clearer example, you could look at a friend who won the lottery: although he clearly benefited from his ticket, it still would have been the wrong call from an expected-value perspective to buy it, and it certainly would not suggest that you should buy a lottery ticket. We have to be careful of survivorship bias. Mainly we are looking at factors that are predictive of something having the most impact, and singular examples do not tell us much about field building vs. making quicker progress in a more established field. That said, I would be really interested in more systematic research in this area.
Good idea, I added CE to the first use of “we”.
Really interesting post, but I do want to flag a big concern I have with the comparative calculation. Broadly, estimated effects are almost always going to be far more positive than well-studied effects. For example, if you estimated GD's impact using standard income-vs-happiness adjustment measures (e.g. the value of doubling someone's income on their happiness), you would end up with a much higher number than the RCT results show. I think this sort of thing happens pretty consistently and predictably. For example, it would be really easy to imagine StrongMinds' treatments being different enough from the most-studied ways of doing CBT for the treatment effects to persist only 1 year (which would reduce the cost-effectiveness to about equal), and it is easy to imagine several such changes (almost all going in a more pessimistic direction).

On the flip side, there has been extensive research and evaluation, and a huge number of charities founded, in the global health space, leading to a comparatively small number of super-strong charities, many of which are explicitly focused on cost-effectiveness/impact. This same work (as far as I know) has not been done in the mental health area. In many ways, you are comparing a very strong global poverty charity to a much more average mental health charity. Thus I would not necessarily need to see a current mental health charity beating GiveWell's best to be convinced the area as a whole could be very effective, if some strong research, evaluations, and impact-focused charities were founded or identified in the area. Given my current work with Charity Entrepreneurship, the main case I am considering is whether a new, well-researched, and impact-focused charity in mental health could be competitive with GiveWell top charities in effectiveness. I feel like the posts you have made over time have made this claim seem pretty plausible.
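The persistence worry above is a simple sensitivity calculation. A minimal sketch with hypothetical numbers (none of these are from the post being replied to), treating per-person benefit as effect size times years the effect persists, divided by cost:

```python
# Illustrative sensitivity check (all numbers hypothetical): how the assumed
# persistence of a treatment effect changes a cost-effectiveness estimate.
def cost_effectiveness(effect_per_year, years_persist, cost_per_person):
    # benefit-units per dollar: annual effect x years it lasts / cost
    return effect_per_year * years_persist / cost_per_person

optimistic = cost_effectiveness(effect_per_year=1.0, years_persist=5,
                                cost_per_person=100)
pessimistic = cost_effectiveness(effect_per_year=1.0, years_persist=1,
                                 cost_per_person=100)

# A single plausible assumption change (5-year vs 1-year persistence)
# swings the bottom line by a factor of 5.
print(optimistic / pessimistic)
```

This is why one pessimistic assumption about persistence can be enough to move an estimated effect back to parity with a well-studied one.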
Our team has fairly recently done work pretty similar to what you are describing. You can see it here: http://www.charityentrepreneurship.com/blog/from-humans-in-canada-to-battery-caged-chickens-in-the-united-states-which-animals-have-the-hardest-lives-results
When we looked at larger groups like fish or bugs we looked for species that were a) more studied and b) more populous. For example, for bugs this tended to be ants, bees, flies, and beetles. Overall though we tried to get a score that we felt would be consistent with “a random unknown bug is killed by an insecticide. What was the welfare score of that bug?”
We only set aside enough time to cover a certain number of animals, and we did not think looking at most regional differences was as important as covering more animals. We will be releasing a table with some specific welfare changes (e.g. animals raised without any physical alterations) which will shed a bit of light on some regional differences. That being said, I expect the broadest level conclusions (e.g. prioritizing fish) to hold across different locations.
Thanks. It indeed looks like that was taken from a report on the breeders.
Wild rat indeed includes rats that live in cities and apartments (as long as they are not domesticated/pet rats). We definitely considered causes of death by humans (which for rats was quite a sizable percentage of their deaths) and our next report is in fact on ethical pest control, including possibilities like more ethical rodenticides and legal changes to move people from sticky to snap traps.
So on #1, there have been some discussions of this, but our team was not convinced enough by the arguments to include such a factor in our analysis. You can see more here https://www.lesswrong.com/posts/2jTQTxYNwo6zb3Kyp/preliminary-thoughts-on-moral-weight and in the links at the bottom of that post. It would change things quite a bit. We have not done the calculation, but off the top of my head I would expect it would impact insects most significantly, with other animals moving up or down a category; e.g. cows might move to mid, but I would not expect them to move to high.
On #2, indeed our research is mostly focused on which charities should be founded in the animal space. That being said, I do think it cross-applies. For example, I would far prefer someone to eat beef and give up chicken than the opposite. For giving up different food categories, I think it would go something like Fish > Chicken > Eggs > Pork > Beef > Milk > Cheese in order of importance, based on both the animals' welfare and the number of animals it takes to form a meal (e.g. 1 chicken vs. 0.01 cows).
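The ranking logic above is roughly (animals per meal) x (welfare cost per animal). A minimal sketch: the per-meal counts for chicken and beef come from the comment, but the welfare weights are placeholders I made up, not Charity Entrepreneurship's actual figures:

```python
# Rough sketch of the ranking logic: harm per meal = animals consumed per
# meal x welfare cost per animal. Per-meal counts (1 chicken, 0.01 cows)
# are from the comment; the welfare weights are hypothetical placeholders.
def harm_per_meal(animals_per_meal, welfare_cost_per_animal):
    return animals_per_meal * welfare_cost_per_animal

chicken = harm_per_meal(1.0, 8)    # 8 is a made-up welfare weight
beef = harm_per_meal(0.01, 6)      # 6 is a made-up welfare weight

# Even with a similar welfare weight per animal, chicken dominates
# because a meal consumes ~100x more animals.
print(chicken, beef)
```

The design point is that the per-meal animal count can matter more than per-animal welfare differences, which is why chicken outranks beef in the list.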
In terms of other animals that could be quite net positive, large herbivores and predators at the top of their food chain with relative abundance of food (e.g. elephants, moose, whales and dolphins) would be my guess, but we did not go deep into any of those animals. Some domestic animals (e.g. well treated dogs and cats) also seem plausible to have pretty net positive lives.
Sadly I was pretty specific with what data I was going to publish and this is it. I suspect that identities of some people could be determined with the full raw data so can understand why people would not want it published.
Re: biological markers, the ideal situation would be multiple markers measured in the animal in an ideal life vs. their current life vs. a perfectly unideal life; scores would then be given based on how their current life compares. In practice, we have sometimes found data on a happy life vs. a standard life for an animal and can get some sense of how far apart these are, but often we have found no applicable data at all for this section. Our reports are very time-capped (5 hours or less, depending on the importance of the animal), so we do not dive deep into the mechanisms.
Humans from different situations will be ranked as well. I agree having them as a comparative measure for cross-species comparison allows for much easier intuition checks.
Examples coming soon. We are currently aiming to have ~15 done and published by 10/7/18. Our full goal for this project is to create a consistent, systematic baseline to quantify the benefits of various interventions, which would then allow us to compare specific charity ideas and rank which might be the best few to found within the animal movement.
http://everydayutilitarian.com/essays/how-much-suffering-is-in-the-standard-american-diet/ is the closest thing to calculating the value of going vegetarian that I know.
Yes indeed, that is the next step. We plan on applying this system to ~15 animal situations and doing a 1-5 hour report on each. This would cover both different animals (e.g. wild rats and factory-farmed cows) and different welfare situations for the same animal (e.g. a report each for battery-caged vs. enriched-cage laying hens).
On biological markers specifically, from the research we have done so far, it is very hard to find any consistent biological markers, let alone situations where we have several that we can cross-compare on the same animal. Generally, a good score might look like "some cortisol tests have been done on rats in an ideal living situation vs. wild rats, and the cortisol levels are about the same," whereas if the same study were done but the cortisol levels were much higher in the wild rats, that would be an indication of lower wild-rat welfare.