How should we assess very uncertain and non-testable stuff?
There is a good and widely accepted approach to assessing testable projects, roughly what GiveWell does. It is much less clear how EA research organisations should assess projects, interventions and organisations with very uncertain, non-testable impact, such as policy work or academic research. There is some disparate material on this question on blogs, on Open Phil's and 80k's websites, in the academic and grey literature, and elsewhere. However, this information is not centralised; it's not clear what the points of agreement and disagreement are; many of the organisations that have thought about this question will have insights that have not been shared with the community (e.g. perhaps CSER and FHI); and the mechanisms for sharing relevant information in the future are unclear.
Ultimately, it would be good to collate and curate all the best material on this, so that EA researchers at separate EA orgs would have easy access to it and would not have to approach this question on their own. As a first step, we invite people who have thought about this question to discuss their insights in the comments to this post. Topics could include:
How far should we use quantified models?
e.g. The Oxford Prioritisation Project used quantified models to assess really uncertain things like 80k and MIRI.
Open Phil doesn't appear to do this (they don't mention it often in their public-facing documents).
What role should the Importance/Neglected/Tractable framework play?
Should it be used to choose between interventions and/or causes?
Should quantitative models be used instead of ITN?
How quantified should the ITN framework be? As quantified as 80k's? More intuitive? (A rough, purely illustrative sketch of combining log-scale scores is included after this list.)
What are the key takeaways from the history of philanthropy, and the history of scientific research?
What’s the best way to assess historical impact?
Process tracing or something like it?
What are the main biases at play in assessing historical impact?
Who do you ask?
Is hits-based giving the right approach and what follows from it?
How relevant is track record on this approach? Sometimes Open Phil takes account of track record, other times not.
Should we favour choosing a cause area and then making lots of bets, or should we be more discerning?
What are the most important considerations for assessing charities doing uncertain-return stuff?
Strength of team
Current strategy
Potential to crowd in funding.
What are the best theories of how to bring about political change?
How much weight should we put on short to medium-term tractability?
Given the nonlinear nature of e.g. political change, current tractability may not be the best guide.
Are there any disciplines we could learn from?
Intelligence analysis.
Insurance (especially catastrophe insurance).
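As a purely illustrative example of the "how quantified" question above, here is a minimal sketch of combining log-scale ITN scores, roughly in the spirit of 80k's published framework. The factor definitions and all numbers below are invented, not anyone's actual assessments:

```python
import math

# Purely illustrative: adding log scores is equivalent to multiplying the
# underlying quantities, so problems can be compared on a single scale.
# The inputs below are invented, not 80k's actual estimates.

def itn_score(scale, neglectedness, tractability):
    """Sum of log10 scores for three linear (non-log) estimates."""
    return sum(math.log10(x) for x in (scale, neglectedness, tractability))

# Hypothetical inputs:
#   scale         ~ good done if the problem were fully solved
#   neglectedness ~ 1 / resources currently going into the problem
#   tractability  ~ fraction of the problem solved by doubling resources
problems = {
    "Problem A": dict(scale=1e6, neglectedness=1 / 1e7, tractability=0.3),
    "Problem B": dict(scale=1e4, neglectedness=1 / 1e5, tractability=0.6),
}

for name, factors in problems.items():
    print(name, round(itn_score(**factors), 2))
```

The only point of a sketch like this is that summing log scores is the same as multiplying the underlying factors; whether the inputs can be estimated well enough to make the exercise worthwhile is exactly the question above.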
Thanks, John and Marinella @ Founders Pledge.
In academic research, government and foundation grants are often awarded using criteria similar to ITN, except:
1) ‘importance’ is usually taken as short-term importance to the research field, and/or to one country’s current human inhabitants (especially registered voters),
2) ‘tractability’ is interpreted as potential to yield several journal publications, rather than potential to solve real-world problems,
3) ‘neglectedness’ is interpreted as addressing a problem that’s already been considered in only 5-20 previous journal papers, rather than one that’s totally off the radar.
I would love to see academia in general adopt a more EA perspective on how to allocate scarce resources—not just when addressing problems of human & animal welfare and X-risk, but in addressing any problem.
Another big area you didn’t mention is Superforecasting, prediction markets and that kind of thing.
Good shout. Does anyone have any thoughts on this that aren't already well known, or that disagree with Tetlock?
This isn’t a unique thought, but I just want to make sure the EA community knows about Gnosis and Augur, decentralized prediction markets built on Ethereum.
https://gnosis.pm/
https://augur.net/
Here’s a link to the talk for those without Facebook: https://www.youtube.com/watch?v=67oL0ANDh5Y (go to 21:00)
Here’s a written version by 80,000 Hours: https://80000hours.org/articles/problem-framework/
This is sort of a meta-comment, but there's loads of important stuff here, each of which could have its own thread. Could I suggest someone (else) organises a (small) conference to discuss some of these things?
I've got quite a few things to add on the ITN framework, but nothing I can say in a few words. Relatedly, I've also been working on a method for 'cause search', a way of finding all the big causes in a given domain, which is the step before cause prioritisation. That's not something I can write out succinctly either (yet, anyway).
I have organized retreat/conference-type things before and would probably be up for organizing something like this if there was interest. I can contact some people and see if they think it would be worthwhile. I'm not sure what to expect attendance to be like, though (would 20 people be optimistic?).
I have a lot of thoughts on cause search, but possibly at a more granular level. One of the big challenges when you switch from an assessing to a generating perspective is finding the right problems to work on, and it’s not easy at all.
Concur. If only we were meeting in the pub in, say, 1 hour to discuss this...
I think there’s one happening in London in November that is discussing questions of this nature—it may be worth seeing if they will add it to the schedule if it’s not already there.
I think splitting off these questions would balkanise things too much, making it harder for people interested in this general question to get relevant information.
They do, though it's usually not the first thing they mention, e.g. here: https://docs.google.com/document/d/1DTl4TYaTPMAtwQTju9PZmxKhZTCh6nmi-Vh8cnSgYak/edit
This seems like the right approach. Quantify but don’t hype the numbers.
I definitely agree that information on these topics is ripe for aggregation/curation.
My instinct is to look to the VC/startup community for some insight here, specifically around uncertainty (they’re in the business of “predicting/quantifying/derisking uncertain futures/projects”). Two quick examples:
Micro models: they use a gate/funnel pattern to assess uncertainty over time. E.g. here's a meta-framework for the startup lifecycle from the entrepreneur's perspective: https://d3ansictanv2wj.cloudfront.net/lean_0505-8a1e4251dd6cd9f28b38a2eb6354a1be.png. E.g. here's a surprisingly helpful grid with metrics for VCs to invest in SaaS startups: https://www.quora.com/What-are-average-revenues-MRR-or-ARR-for-SaaS-companies-at-the-time-of-the-Series-A/answer/Christoph-Janz?srid=zROG
Macro Models—They use macro models to predict how startups play into the context of future trends. e.g. The abstraction of trust — https://medium.com/@alexdanco/emergent-layers-an-introduction-f91c3cbe0175 or Aggregation Theory — https://stratechery.com/2015/aggregation-theory/.
I would expect an “EA-focused uncertainty model” to include gates that map a specific project through time given models of macro future trends.
One difference between for-profits and non-profits is that the worst case scenario for a for-profit is going out of business. The worst case for a non-profit is unintentionally causing harm. Someone mentioned that there aren’t a lot of historians in EA—I wonder if this is because the history of attempts to do good is discouraging.
If you haven’t come across it yet, you might like to look at Back of the Envelope Guide to Philanthropy, which tries to estimate the value of some really uncertain stuff.
I started putting together a reading list, but it’s going to take longer than I thought. To avoid making the perfect the enemy of the good, and to toot my own horn, I thought you might like to read How do EA Orgs Account for Uncertainty in their Analysis? by myself, Kathryn Mecrow, and Simon Beard.
Yep yep, happy to! A couple things come to mind:
We could track the "stage" of a given problem/cause area, in a similar way that startups are tracked by Seed, Series A, etc. In other words, EA prioritization would be categorized w.r.t. stages/gates. I'm not sure if there's an agreed-upon "stage terminology" in the EA community yet. (I know GiveWell's Incubation Grants http://www.givewell.org/research/incubation-grants and EA Grants https://www.effectivealtruism.org/grants/ are examples of recent "early stage" investment.) Here are some example stages:
Stage 1) Medium dive into the problem area to determine ITN.
Stage 2) Experiment with MVP solutions to the problem.
Stage 3) Move up the hierarchy of evidence for those solutions (RCTs, etc.).
Stage 4) For top solutions with robust cost-effectiveness data, begin to scale.
(You could create something like a “Lean Canvas for EA Impact” that could map the prioritized derisking of these stages.)
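As a very rough illustration of what such a gated funnel could look like in practice (stage names and criteria below are invented, not an agreed EA terminology):

```python
# Hypothetical stage/gate funnel, loosely mirroring the Seed / Series A
# analogy above. Names and criteria are invented for illustration only.

STAGES = [
    ("Scoping",  "medium dive into the problem area to estimate ITN"),
    ("MVP",      "experiment with cheap candidate solutions"),
    ("Evidence", "move up the hierarchy of evidence (RCTs, etc.)"),
    ("Scale",    "scale the top solutions with robust cost-effectiveness data"),
]

def gate_to_advance(current_stage_index: int) -> str:
    """Describe what a project must show before it moves to the next stage."""
    if current_stage_index + 1 >= len(STAGES):
        return "already at the final stage"
    name, criterion = STAGES[current_stage_index + 1]
    return f"to reach '{name}': {criterion}"

print(gate_to_advance(0))  # what it takes to move from Scoping to MVP
```

The point is just that each gate names the evidence a project needs before more resources are committed to it.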
From the “future macro trends” perspective, I feel like there could be more overlap between EA and VC models that are designed to predict the future. I’m imagining this like the current co-evolving work environment with “profit-focused AI” (DeepMind, etc.) and “EA-focused AI” (OpenAI, etc.). In this area, both groups are helping each other pursue their goals. We could imagine a similar system, but for any given macro trend. i.e. That macro trend is viewed from a profit perspective and an impact/EA perspective.
In other words, this is a way for the EA community to say “The VC world has [x technological trend] high on their prioritization list. How should we take part from an EA perspective?” (And vice versa.)
(fwiw, I see two main ways the EA community interacts in this space—pursuing projects that either a) leverage or b) counteract the negative externalities of new technologies. Using VR for animal empathy is an example of leverage. AI alignment is an example of counteracting a negative externality.)
Do those examples help give a bit of specificity for how the EA + VC communities could co-evolve in “future uncertainty prediction”?
I think ITN is good for scoping problems, but many think it shouldn’t be used for interventions. So for interventions, I think quantitative models like this one are valuable.
I think an important one is: How likely is the project to reduce the uncertainty of the return?
E.g. will it resolve a crucial consideration?
Edit to give more detail:
Resolving a crucial consideration increases the value of all your future research massively. Take, for example, the question of whether there will be a hard or soft takeoff. A hard takeoff favours working on AI safety now, whereas a soft takeoff favours building political and social institutions that encourage cooperation and avoid wars. Since both scenarios put humanity's future on the line, each is massively important conditional on it being the scenario that actually occurs.
Resolving the question (or at least driving down the uncertainty) would allow the whole community to focus on the right scenario and get much better bang for their buck, even if the research doesn't directly address the problem itself.
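To put rough numbers on it (all invented, just to illustrate the value-of-information logic):

```python
# Illustrative value-of-information calculation with invented numbers.
# Suppose we assign 50/50 odds to a hard vs soft takeoff, and that work
# aimed at the correct scenario is worth 100 (arbitrary units) while work
# aimed at the wrong scenario is worth 10.

p_hard = 0.5
value_right, value_wrong = 100, 10

# Without resolving the question we must pick one scenario to focus on;
# expected value of betting on either scenario:
ev_without_info = p_hard * value_right + (1 - p_hard) * value_wrong  # 55

# If research resolved the question first, we would always focus on the
# right scenario:
ev_with_info = value_right  # 100

value_of_information = ev_with_info - ev_without_info
print(value_of_information)  # 45: large, even though the research itself
                             # does nothing to reduce the risk directly
```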
I’ve been pointed to this as well—http://lesswrong.com/lw/hcq/estimates_vs_headtohead_comparisons/
I think quantifying is almost always the right approach when it's an in-depth exploration along the lines of 80,000 Hours' reasoning. That said, when a quantitative estimate is used in a public summary of the reasoning, it can give a false pretense of certainty in cases where the estimate is very uncertain, leading to problems like those described here in animal advocacy. I think the best solution is to reason quantitatively when possible, but to keep that reasoning in a public document linked from any announcements, rather than highlighting the quantitative estimates in a way that often misleads people.
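As a toy illustration of "quantify, but report the uncertainty rather than a headline number" (the model structure and all numbers below are made up):

```python
import random

# Invented toy model: cost-effectiveness = people reached * effect per person / cost.
# Rather than quoting a single number, sample the uncertain inputs and report
# a central estimate alongside a wide interval.

random.seed(0)

def sample_cost_effectiveness():
    people_reached = random.lognormvariate(mu=9, sigma=1.0)      # very uncertain, median ~8,000
    effect_per_person = random.lognormvariate(mu=-2, sigma=0.7)  # small, uncertain effect size
    cost = random.lognormvariate(mu=11, sigma=0.3)               # ~$60,000, fairly well known
    return people_reached * effect_per_person / cost

samples = sorted(sample_cost_effectiveness() for _ in range(10_000))
median = samples[len(samples) // 2]
low, high = samples[int(0.05 * len(samples))], samples[int(0.95 * len(samples))]

print(f"median: {median:.3f} units of impact per dollar")
print(f"90% interval: {low:.3f} to {high:.3f}")
```

The interval, not the median, is the honest summary; the public write-up can link to the full model while the announcement avoids leading with a single point estimate.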
Another important step to take on this issue is probably to distinguish between problems that are unmeasurable and those that simply have not been measured yet. For those that have not been measured yet, we should try to measure them; that might take some creativity from ingenious researchers.
thanks