We’ve gone through countless iterations of this announcement post, which usually took the shape of one of us drafting something, then wondering whether it’s too complicated and will cause people to tune out and ignore the contest, and then trying to greatly shorten and simplify it.
There’s a difficult trade-off between the high-fidelity communication of our long explainer posts and the concision that’s necessary to get people to actually read a contest announcement. Our explainer posts get very little engagement. Participating in the contest doesn’t require understanding exactly how our mechanisms work, so we hope to reach more people by explaining things in simpler terms, without words like “ex ante” and comparisons to constructed counterfactual world histories.
Like, grocery shopping would be a terrible experience if every customer had to understand all the scheduling around harvests, the stocks and flows between warehouses, just-in-time delivery, the pricing-in of the expected share of produce that expires before it’s bought, etc. If anyone who wants to use impact markets has to spend more time up front learning about them than the markets are worth to them, that’d be a failure.
This is exacerbated in our case, where a submitter has a < 100% chance of getting a reward of a few hundred dollars. That comes down to quite little money in expectation, so we’ve been trying hard to keep the time commitment as light as possible while linking our full explainer posts at every turn, to make sure that people can’t miss the high-fidelity version if they’re looking for it. Once we have bigger budgets, we can ask people to engage more with our processes upfront.
That said, we’ve thought a lot about the bolded key sentence “morally good, positive-sum, and non-risky.” We hope that everyone who submits will read it. By “non-risky” we mean “ex ante non-risky.” We hoped the term captured that, since it’s not common to talk about “risks” ex post. Even a sentence like “the Cuban Missile Crisis was risky” doesn’t say that the event is a risk for us today, after the fact, but that it was risky at the time it was happening.
But I’ll ask Dony to go over the post again and see if we can clarify this in a place where it doesn’t cause more confusion than it resolves. Maybe my bolded text below can be inserted below the first sentence that you cited.
For now, let me reiterate for every potential submitter reading this:
We will value impact according to Attributed Impact in its latest version at the time, so if writing your post would’ve been net negative in expectation before you wrote it (ex ante), it cannot be valued positively at any later time! The ex ante expected value is the ceiling of any potential future valuation of the impact, regardless of how well it happens to turn out.
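In code terms, the rule might look like this minimal sketch of mine (an illustration only, not the actual Attributed Impact formula, which has more moving parts):

```python
def max_valuation(ex_ante_ev: float, ex_post_value: float) -> float:
    """Toy version of the ex ante ceiling rule: the value a retro funder
    may attribute to a certificate is capped by the ex ante expected value,
    so an ex ante net-negative action is never valued positively, however
    well it happens to turn out."""
    if ex_ante_ev <= 0:
        return 0.0  # net negative (or zero) in ex ante expectation: never valued positively
    return max(0.0, min(ex_ante_ev, ex_post_value))
```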
Every submitter also has to answer questions like “What positive impact did you expect before you started the project? What were unusually good and unusually bad possible outcomes? (Please avoid hindsight bias and take the interests of all sentient beings into account.)” before we will buy any of the impact. (I should reword that a bit, maybe, “What positive impact was to be expected …,” to make it fit with Attributed Impact.)
Here is a section (verbatim) that I originally wrote for the post that we cut entirely for length:
Downsides
Most of the problems that impact markets might cause are detailed in Toward Impact Markets.
We are particularly concerned with the following:
Issuers might be incentivized to:
try many candidate interventions, some of which might backfire terribly, but then issue certificates only for the rare interventions that succeeded,
try many candidate interventions, some of which might be terrible for some moral systems, but issue the certificates under different aliases and sell them to different retro funders,
try an intervention many times, usually with disastrous results, but issue an impact certificate only for the rare iteration of the intervention that succeeded,
do something good once but then reframe it slightly to sell the impact from it multiple times to different people on different marketplaces,
compete with other issuers for funding by badmouthing them or withholding resources from them when otherwise they would’ve collaborated,
pander to the perceived preferences of the retro funders even in cases where the issuers have a clearer picture of what is impactful,
generate externalities for individuals who are not themselves represented on the market and whom the retro funders are not aware of,
use the markets to issue disguised threats against retro funders.
Investors might be incentivized to:
do little research and just invest large sums into a wide range of projects, regardless of whether they’re likely to backfire, on the off-chance that (1) one of them actually turns out good or (2) at some point in the future there will be a very rich retro funder who will think that a project turned out good (see the toy calculation below),
invest mostly in things that are highly verifiable to avoid the ambiguity about the purview of certificates that comes with lower levels of verifiability, thereby disadvantaging some interventions for reasons unrelated to their impact,
actively trade certificates to the point of creating a lot of noise that distracts issuers from their object-level work,
compete with other investors by badmouthing them or withholding resources from them, and pander to the perceived preferences of retro funders, as described for issuers above.
Retro funders might:
get scammed by some of the above tricks,
abuse their power by incentivizing projects that are disastrous for some moral systems.
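To make the first investor concern concrete, here is a toy calculation with made-up numbers. A naive retro funder who buys whatever happened to turn out well pays for the wins but never charges for the losses, so indiscriminate investing can profit from gambles that are net negative ex ante:

```python
# Toy model of the naive-investor incentive (all numbers invented for illustration).
# A risky project has a 10% chance of an outcome a naive retro funder would
# buy for $50,000 and a 90% chance of causing harm, in which case the
# certificate is worthless and the harm goes unpriced.
p_win, retro_price, cert_cost = 0.10, 50_000, 1_000

naive_investor_ev = p_win * retro_price - cert_cost
print(naive_investor_ev)  # 4000.0: an expected profit despite mostly harmful outcomes
```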
We are optimistic that the bulk of these problems are solvable in a mature impact market. We don’t have one fully general mechanism but a range of incremental ones. Most of them can be summarized as an attempt to facilitate moral trade on a financial market:
Issuers:
commit to an operationalization of impact called Attributed Impact, and justify their actions by it, under which an action that is net negative in ex ante expectation can never be positive in value even if it so happens to turn out well (the sketch after this list shows how this ceiling defuses the toy calculation above),
can sell impact only from classes of actions that are very unlikely to be extremely harmful, namely articles on the EA Forum (and at a later stage maybe other similar artifacts),
can sell impact only from classes of actions that have passed multiple rounds of vetting – for example, in this case, because the moderators of the EA Forum allowed the post and because we allowed its certificate to be issued on our platform,
can, conversely, sell impact from exposés of other certificates whose issuers cheated in some fashion to hide actual or probabilistic negative externalities,
can, conversely, sell impact from articles that change the evaluation of the impact of other certificates,
can, conversely, sell impact from articles detailing new problems of or attack vectors against impact markets.
Investors:
are incentivized by retro funders just enough that those who add information to the market by making good predictions are profitable.
Retro funders:
should commit to Attributed Impact to push issuers and investors to commit to Attributed Impact too, thereby averting negative externalities and threats,
have the option to delegate the decision-making or the prefiltering of funding opportunities to us,
have the option to pivot entirely to retro funding, which should free up so much staff time that they can build expertise in recognizing exploits,
will, at some point, have the support of “the pot,” an investment mechanism that acts as a semi-automated retro funder and reinforces the Schelling point of Attributed Impact.
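Revisiting the toy calculation from the Downsides section under the ex ante ceiling (same made-up numbers): the risky project is net negative in ex ante expectation, so a retro funder committed to Attributed Impact values its certificate at zero no matter how the gamble resolves, and the indiscriminate investor’s edge disappears:

```python
# Same invented numbers as above, now with the ex ante ceiling applied.
p_win, win_value = 0.10, 50_000
p_loss, loss_value = 0.90, -20_000  # assumed (invented) size of the harm

ex_ante_ev = p_win * win_value + p_loss * loss_value  # -13000.0: net negative ex ante
ceiling = max(0.0, ex_ante_ev)  # an aligned retro funder pays at most this
print(ceiling)  # 0.0: the certificate is worthless, so the gamble no longer pays
```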
The remaining problems are mostly related to (1) imperfections in the implementation of these solutions and (2) flaws in retro funder alignment. Suppose a really generous retro funder who is unconcerned with moral cooperation or cheating, and who has enough capital to spend, joins the market, and suppose impact investors are ready to stay invested in countless projects for decades until that funder arrives. Then such a retro funder can have a bad influence on the market from the moment people merely expect them to join, before they have actually arrived.
We don’t think that there is a mechanism that can prevent this from happening because anyone is already free to retroactively reward whoever they like. But we recognize that by writing about impact markets and by running contests like these, we’re making the option more salient.
We want to hit the right balance between minimizing the opportunity costs from delaying the implementation of impact markets and minimizing the direct costs from harm that impact markets might cause. There are those who think that we have an “extreme focus on risks” and those who think that we’re rash for wanting to realize impact markets at all. We would love to get your opinion on where we stand on this balance and how we can improve!
> There’s a difficult trade-off between the high-fidelity communication of our long explainer posts and the concision that’s necessary to get people to actually read a contest announcement. Our explainer posts get very little engagement. Participating in the contest doesn’t require understanding exactly how our mechanisms work, so we hope to reach more people by explaining things in simpler terms, without words like “ex ante” and comparisons to constructed counterfactual world histories.
After this contest, it will still be the case that most people are more likely to read and use instructions that are short and simple. It may be very hard to later “fix” the influence that posts like the OP have on potential future retro funders. Simpler instructions are more prone to becoming a meme. Therefore, retro funders may predict that some (most?) future retro funders will use the simple “buy likable impact” rule rather than the “adhere to the safety solutions in the Toward Impact Markets post” rule, and thus be incentivized to follow the simple rule themselves. Posts like the OP risk pushing everyone towards the Schelling point of “retro funders buy likable impact”. (All this becomes more worrisome if you or someone else in EA ends up launching a decentralized impact market.)
Regarding the claim that “articles on the EA Forum” are “very unlikely to be extremely harmful”: EA Forum posts can disseminate info hazards that can be extremely harmful. (And this does not seem very unlikely, considering that the ideas that are discussed on the EA Forum are often related to anthropogenic x-risks.)
Hmm, I love writing high-fidelity content. Just thinking “how can I express what I mean as clearly as I can” rather than “how can I simplify what I mean to maximize the fidelity/complexity ratio” is a lot easier for me. But a lot of smart people disagree, and point to how shallow heuristics and layered didactic approaches are essential to bridging inferential gaps under time constraints.
So I would like to pose the question to anyone else reading this: If you read “Toward Impact Markets” and you read the above post, do you think we should’ve gone for the same level of fidelity above? Or not? Or something in between?
> EA Forum posts can disseminate info hazards that can be extremely harmful. (And this does not seem very unlikely, considering that the ideas that are discussed on the EA Forum are often related to anthropogenic x-risks.)
Excluding whole categories of usually valuable content from contests, though, seems like a very uncommon level of caution. I’m not saying that I *know* it’s exaggerated caution, but there have been many prize contests for content on the EA Forum, and none of them were so concerned about info hazards. Some of them have had bigger prize pools, too. In addition, the EA Forum is moderated, and the moderators probably have a protocol for responding to info hazards.
I’ve long pushed for something like the “EA Criticism and Red Teaming” contest (though I usually had more specific spins on the idea in mind), I’m delighted it exists, and I think it’ll be good. But it is a lot riskier than ours: it has a greater prize pool; the most important red-teaming should focus on topics that are currently important to EA, i.e., “longtermism” (“how do we survive the next 20 years”) topics like biosecurity and AI safety; and the whole notion of red-teaming is conceptually close to info hazards too. (E.g., some people claim that some others invoke “info hazard” as a way to silence epistemic threats to their power. I mostly disagree, but my point is about how close the concepts are to each other.)
The original EA Forum Prize referred readers to the About page at the time (note that they, too, opted to put the details on a separate linked page), which explicitly discourages info hazards, rudeness, illegal activities, etc., but spends about a dozen words on fleshing this out, as opposed to our 10k+. Of course, if you can communicate the same thing in a dozen words and in 10k+ words, then a dozen is better. But if you think that “non-risky” is unclear about whether it refers to actions that are risky while they’re being performed or only to actions whose results remain risky indefinitely, then “What we discourage (and may delete) … Information hazards that concern us” is unclear in the same way. Maybe someone is aware of an info hazard so dangerous that the moment they post it, they can see from their own existence or nonexistence whether they got lucky. I think both framings clearly discourage such sharing, but regardless, the contests are parallel in this regard. (Or, if anything, ours is safer because we are very, very explicit about the ex ante ceiling in our detailed explainer, with definitions, examples, diagrams, etc.)
But I don’t want to just throw this out there as an argument from authority: “If the EA Forum gods do it, it’s got to be okay.” It’s just that there is a precedent (over the course of four years or so) for lower levels of caution than ours, with nothing terrible happening. That is valuable information for us when we try to make our own trade-off between risks and opportunity costs. (But of course all the badness could be contained in one Black Swan event that is yet to come, so there’s no certainty.)
The original EA Forum Prize does not seem to have had the distribution mismatch problem; the posts were presumably evaluated based on their ex-ante EV (or something like that?).
I don’t know whether they were. Either way, it was probably not obvious to some post authors that they’d be judged by ex ante EV, and it’s enough for one of them to think they’ll be judged only by ex post value to run into the distribution mismatch.
At least to the same extent – whatever it may be – as our contest. Expectational consequentialism seems to me like the norm, though that may be just my bubble, so I would judge both contests to be benign and net positive because I would expect most people to not want to gamble with everyone’s lives, to not think that a contest tries to encourage them to gamble with everyone’s lives, and to not want to just disguise their gamble from the prize committee.
In the original EA Forum Prize, the ex-post EV at the time of evaluation is usually similar to the ex-ante EV, assuming that the evaluation happens shortly after the post was written. (In a naive impact market, the price of a certificate can be high due to the chance that, 3 years from now, its ex-post EV will be extremely high.)
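To spell that out with invented numbers: suppose a certificate has a 5% chance of being deemed worth $200,000 by some future retro funder three years from now, and is worthless otherwise. A risk-neutral naive market prices it at roughly the expectation, with no reference to the ex ante EV:

```python
# Naive certificate pricing as an expectation over future ex-post EV
# (all numbers invented for illustration).
p_extreme, extreme_value = 0.05, 200_000  # small chance of a huge ex-post EV in 3 years

naive_price_today = p_extreme * extreme_value
print(naive_price_today)  # 10000.0: a high price today, regardless of the ex ante EV
```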
So you’re saying it’s fine for them not to make the distinction because they’re so quick that it hardly matters, but that it’s important for us? That makes sense. I suppose that circles back to my earlier comment that I think that our wording is pretty clear about the ex ante nature of the riskiness, but that we can make it even more clear by inserting a few more sentences into the post that make the ex ante part very explicit.