Project: A web platform for crowdsourcing impact estimates of interventions.

TL;DR: Create a web platform for computing live-updated impact estimates for different interventions, based on crowd-sourced estimates of assumptions and on explicit (simplified) theory-of-change causal models. Encourage founders, funders, skeptics and ordinary EAs to participate in these “markets”.

Please tell me why you think this might not work!

Please tell me your ideas to make this even better!

Overview

Estimates on important quantities

Through wisdom-of-the-crowds (and ideally, actual prediction markets), we can get better estimates of numbers important for the EA movement. By improving the estimates of our assumptions, we can improve the effectiveness of the EA movement as a whole.

Image: a prediction market interface, with the title “Estimated impact of Milestone 1. Reduction in number of chickens in battery cages in 2026 compared to counterfactual.” It shows a user placing a Gaussian distribution on top of the aggregate distribution of others’ predictions.
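
As a rough illustration of what aggregating such predictions could look like, here is a minimal sketch that pools several users' Gaussian forecasts into an equal-weight mixture and reports its mean and standard deviation. The `GaussianForecast` class, the `aggregate` function and the equal weighting are all assumptions made for illustration. I haven't settled on an aggregation rule, and a real market mechanism would work differently.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class GaussianForecast:
    """One user's forecast, expressed as a normal distribution (hypothetical type)."""
    mean: float
    std: float


def aggregate(forecasts):
    """Return (mean, std) of the equal-weight mixture of the given forecasts."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    mix_mean = fmean(f.mean for f in forecasts)
    # Law of total variance: average within-forecast variance plus the
    # spread of the individual means around the mixture mean.
    mix_var = fmean(f.std ** 2 + (f.mean - mix_mean) ** 2 for f in forecasts)
    return mix_mean, mix_var ** 0.5


# Invented numbers, purely for illustration.
crowd = [GaussianForecast(10.0, 2.0), GaussianForecast(20.0, 2.0)]
pooled_mean, pooled_std = aggregate(crowd)
```

Note that the pooled standard deviation grows when forecasters disagree with each other, which seems like the right behaviour for a number the crowd is genuinely unsure about.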

Formal theory-of-change models

By formalizing (and sharing) theory-of-change models for interventions, we make them easier to critique and explain.

If we use an actual (simple) programming language for these models, we can compute impact estimates in a standard way.
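
To make this concrete, here is a minimal sketch of a theory-of-change model written as an ordinary Python function and evaluated by plain Monte Carlo sampling. Everything in it is an assumption for illustration: the function names, the choice of distributions, and all the numbers are invented, and on the real platform the inputs would come from the crowd-sourced estimates rather than being hard-coded.

```python
import random
import statistics


def chickens_spared(rng):
    """Draw one sample from a toy theory-of-change model (all numbers invented)."""
    caged = rng.gauss(300e6, 30e6)       # hypothetical: chickens currently in battery cages
    p_success = rng.betavariate(2, 8)    # hypothetical: chance the campaign reaches milestone 1
    reduction = rng.uniform(0.01, 0.05)  # hypothetical: fraction of cages removed on success
    succeeded = rng.random() < p_success
    return max(caged, 0.0) * reduction if succeeded else 0.0


def estimate(model, n=10_000, seed=0):
    """Run the model n times and summarise the resulting impact distribution."""
    rng = random.Random(seed)
    samples = sorted(model(rng) for _ in range(n))
    return {
        "mean": statistics.fmean(samples),
        "p5": samples[int(0.05 * n)],
        "p95": samples[int(0.95 * n)],
    }


summary = estimate(chickens_spared)
```

Because every model reduces to "a function that returns one sample", the same `estimate` routine can score any intervention, which is what I mean by computing impact estimates in a standard way.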

If we get the EA community to use a lot of these (and maybe weight people’s predictions somehow?), then we get other benefits as well, such as the ability

In addition to the direct benefits of better estimates, encouraging people to put numbers on their conversations and ideas encourages epistemic virtue in the EA community.

Also, having a database of impact estimates of interventions lets us easily compare across cause areas.

In detail, here’s what the platform would have:

  1. The ability to create “estimates” on the site. All estimates allow bets to be posted as a critique/adjustment mechanism (a market?), including the ability to input a probability distribution. Examples of the kinds of numbers we would see are:

    1. “current number of chickens in battery farms”

    2. “expected QALYs of a future with a correctly aligned AI”

    3. “chance of AGI risk from

    4. “% reduction in ‘AGI risk scenario B’ given success of this intervention”

  2. A visual scripting language or simple programming environment for creating simple causal models that describe a theory of change. While creating the model, the user might also create new estimates (like above), seeded with the creator’s prediction.

    1. e.g. the creator would estimate “person-hours required to reach milestone 1 of this intervention”, which would be posted on the site as a thing for other users to bet on.

  3. Allow comments to be posted attached to people’s estimates, to give feedback to the creators.

    1. e.g. “I’m placing this estimate for these reasons, and I’ll change it after we discuss it. I think that your project requires many more person-hours than you have estimated to reach even your first milestone.”

  4. Allow fixing the values of certain numbers and re-generating a visualization of the resulting estimates of different interventions.

    1. e.g. “Here’s a link to your intervention if we assume my estimate of the person-hours required to reach milestone 1. As we can see, that kind of kills this as an intervention compared to just putting the person-hours into <other scalable intervention here>”

  5. Have some way of sharing your reputation score, as determined by how the various aggregated guesses moved over time towards or away from your particular guesses.

    1. Sharing this score probably shouldn’t be the default because it would be too tied up in people’s self-worth, but having a high score would be a good signal.
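
One possible way to compute the reputation score from item 5 is sketched below. It measures what fraction of the gap between your guess and the aggregate (at the time you guessed) has since been closed by the aggregate. The function names and the scoring rule itself are assumptions for illustration, not a worked-out proposal. In particular, this rule ignores how confident the guess was and when it was made.

```python
def movement_score(guess, aggregate_then, aggregate_now):
    """Fraction of the initial gap that the aggregate has since closed.

    1.0 means the aggregate moved all the way to the guess, 0.0 means the
    gap is unchanged, and negative values mean the aggregate moved away.
    """
    gap_then = abs(aggregate_then - guess)
    gap_now = abs(aggregate_now - guess)
    if gap_then == 0:
        # The guess agreed with the aggregate, so it added no new information;
        # penalise it only if the aggregate has drifted away since.
        return 0.0 if gap_now == 0 else -1.0
    return (gap_then - gap_now) / gap_then


def reputation(history):
    """Average movement score over (guess, aggregate_then, aggregate_now) triples."""
    if not history:
        return 0.0
    return sum(movement_score(*entry) for entry in history) / len(history)


# Invented example: the aggregate moved halfway from 50 towards a guess of 100.
halfway = movement_score(100.0, 50.0, 75.0)  # 0.5
```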

Fake money or real money?

My initial thought on this is to just let people make guesses on individual markets, but the lack of a currency or actual incentive might make some people skeptical that the aggregated results are meaningful.

However, there’s a recent proposal that says it’s legal to do prediction markets with real money, conditional on that money being donated to charity: “Predicting for Good: Charity Prediction Markets” (akrolsmir, harsimony)

Using real money has some benefits, i.e. it definitely matches what’s happening with the actual spending of the EA movement. If people make their donations via the site, others can explicitly hedge against those donations if they think that’s better in terms of impact.

I’m confused about how to make estimates using real money on random numbers like “% reduction in ‘AGI risk scenario B’ given success of this intervention”.

What’s required for an MVP?

The core valuable project is three things:

  1. A web interface for creating and estimating on numbers/​assumptions.

    1. For some kinds of estimates, the interface needs to support 2D distributions.

    2. E.g. the probability of smarter-than-human AGI each year has 3 axes: chance, probability weight, and time. The distribution is a 2D surface in 3D.

  2. An intuitive (visual?) programming tool for creating causal models (probabilistic programs) using those numbers.

  3. A back-end for keeping the results up-to-date. This involves aggregating the guesses, and re-deriving the resulting facts by running the causal programs through a probabilistic programming engine (e.g. something that does Markov chain Monte Carlo).

    1. Care needs to be taken to manage “time” in this system in order for results to be as live as possible but also correct.
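
To show what managing “time” might mean for the back-end, here is a minimal sketch that uses a single global revision counter. Every change to an input bumps the revision, and a derived value is only served from cache if it was computed at the current revision. The `LiveModel` class and its methods are hypothetical names. This deliberately coarse rule favours correctness over liveness, since any change invalidates everything, whereas a real system would track dependencies more finely and run a sampler instead of plain functions.

```python
class LiveModel:
    """Toy store of input estimates and derived values (hypothetical design)."""

    def __init__(self):
        self._inputs = {}    # name -> current aggregated estimate
        self._formulas = {}  # name -> (function, names of its dependencies)
        self._cache = {}     # name -> (revision when computed, value)
        self.revision = 0    # bumped on every change; acts as the system clock

    def set_input(self, name, value):
        self._inputs[name] = value
        self.revision += 1

    def define(self, name, fn, deps):
        self._formulas[name] = (fn, tuple(deps))
        self.revision += 1

    def get(self, name):
        if name in self._inputs:
            return self._inputs[name]
        cached = self._cache.get(name)
        if cached is not None and cached[0] == self.revision:
            return cached[1]  # still valid: nothing has changed since it was computed
        fn, deps = self._formulas[name]
        value = fn(*(self.get(dep) for dep in deps))
        self._cache[name] = (self.revision, value)
        return value


# Invented numbers, purely for illustration.
model = LiveModel()
model.set_input("person_hours", 1000.0)
model.set_input("impact_per_hour", 2.0)
model.define("impact", lambda hours, rate: hours * rate, ["person_hours", "impact_per_hour"])
impact_before = model.get("impact")
model.set_input("person_hours", 500.0)  # someone updates the estimate
impact_after = model.get("impact")      # recomputed against the new revision
```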

In my experience of web development, this would probably take about one person-year to get a prototype to the point where it’s both complete and pleasant enough for others to use. (Wouldn’t it be nice if you could disagree with that number right here and now?!)

Extensions to this project:

  • Native apps for good mobile experience.

  • Easy sharing of the estimates on certain numbers (either live or “as of now”).

    • Embed into websites and the EA Forum /​ LW /​ AF

  • Use real money, and somehow send it to charities in the optimal way based on the calculated impact estimates.

  • Attach comments on bets to give more detailed feedback

    • Prompt commenters (and voters on the comment) to update their bet if the issue is addressed.

    • Maybe you’re not allowed to comment or vote on a comment until you make a bet!

Ideas for why this might not work?

  1. Models for why interventions might or might not work could be too complicated to reasonably put into an explicit program.

    1. For example, they might require putting many hours in, or require a high level of programming skill.

    2. Or, they might require sampling from complex probability distributions that make it too hard to use a probabilistic programming language for this purpose.

  2. Even if the aggregation works, and the site is intuitive, people might not use it.

    1. For example, people might not want to be critiqued in this way.

  3. It might take too much work to do.

Extra content post-edit

Here’s a list of links and people I have found on this topic, sneaked in before I hopefully get to 25 upvotes and get onto the Nonlinear library feed.

The main concern from all of these efforts is that it’s hard to get people to use this platform. I think that yes, it needs to be really good to get any adoption.

Final words

I’ve just come up with this idea. Please comment below if it’s bad, or especially if someone’s already doing it and I can just join them.