Impact markets may incentivize predictably net-negative projects
Summary
Impact markets (that encourage retrospective funding, and especially if they allow resale of impact) have a severe downside risk: they can incentivize risky projects that are likely to be net-negative, because they allow people to profit if a project turns out to be beneficial while imposing no cost on them if it turns out to be harmful. This risk is hard to mitigate.
Impact markets themselves are therefore such a risky project. To avoid the conflict of interest issues that arise, work to establish impact markets should only ever be funded prospectively (never retrospectively).
The risk
Suppose the certificates of a risky project are traded on an impact market. If the project ends up being beneficial, the market allows the people who own the certificates to profit. But if the project ends up being harmful, the market does not inflict a cost on them. The certificates of a project that ended up being extremely harmful are worth as much as the certificates of a project that ended up being neutral, namely nothing. Therefore, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial.[1]
Impact markets can thus incentivize people to create or fund net-negative projects. Denis Drescher used the term “distribution mismatch” to describe this risk, due to the mismatch between the probability distribution of investor profit and that of EV.
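To make the distribution mismatch concrete, here is a minimal numerical sketch (the probabilities and dollar amounts are purely illustrative, not taken from the post) comparing a risky project's social expected value with an investor's expected payoff on a naive impact market, where a harmful outcome merely makes the certificate worthless rather than negative:

```python
# Hypothetical risky project: 10% chance of a large benefit, 90% chance of serious harm.
# All numbers are illustrative.
p_good = 0.10
benefit = 1_000_000    # social value if the project turns out well
harm = -2_000_000      # social value if it turns out badly

# Social expected value counts the harmful branch as negative.
social_ev = p_good * benefit + (1 - p_good) * harm

# On a naive impact market the certificate pays off only in the good branch;
# in the bad branch it is merely worthless, never negative.
investor_ev = p_good * benefit + (1 - p_good) * 0

print(f"social EV:   {social_ev:>12,.0f}")    # -1,700,000  (predictably net-negative)
print(f"investor EV: {investor_ev:>12,.0f}")  #    100,000  (still attractive to fund)
```

The project is clearly net-negative in expectation, yet its certificate has positive expected value to an investor, which is the mismatch described above.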
It seems especially important to prevent the risk from materializing in the domains of anthropogenic x-risks and meta-EA. Many projects in those domains can cause a lot of accidental harm because, for example, they can draw attention to info hazards, run harmful outreach campaigns, conduct dangerous experiments (e.g. in machine learning or virology), shorten AI timelines, intensify competition dynamics among AI labs, and so on.
Mitigating the risk is hard
The Toward Impact Markets post describes an approach that attempts to mitigate this risk. The core idea is that retro funders should consider the ex-ante EV rather than the ex-post EV if the former is smaller. (The details are more complicated; a naive implementation of this idea would incentivize people to launch a safe project and later expand it to include high-risk high-reward interventions.)
We think that this approach cannot be relied upon to sufficiently mitigate the risk due to the following reasons:
For that approach to succeed, retro funders must be familiar with it and be sufficiently willing and able to adhere to it. However, some potential retro funders are more likely to use a much simpler approach, such as “you should buy impact that you like”.
Other things being equal, simpler approaches are easier to communicate, more appealing to potential retro funders, more prone to become a meme and a norm, and more likely to be advocated for by teams who work on impact markets and want to get more traction.
If there is no way to prevent anyone from becoming a retro funder, being careful about choosing/training the initial set of retro funders may not help much. Especially if the market allows people to profit from outreach interventions that attract new retro funders who are not very careful.
The price of a certificate tracks the maximum amount of money that any future retro funder will be willing to pay for it. Prudent retro funders do not (significantly) offset the influence of imprudent retro funders on the prices of certificates of net-negative projects.
Traditional (prospective) charitable funding can have a similar dynamic; one only needs one funder to support a project even if everyone else thinks it’s bad. Impact markets make the problem much worse, though, because they add variance from uncertainty about project outcomes as well as variance in funder views.
Suppose that a risky project that is ex-ante net-negative ends up being beneficial. If retro funders attempt to evaluate it after it already ended up being beneficial, hindsight bias can easily cause them to overestimate its ex-ante EV. This phenomenon can make the certificates of net-negative projects more appealing to investors, already at an early stage of the project (before it is known whether the project will end up being beneficial or harmful).
The conflict of interest problem for establishing impact markets
Can we just trust that people interested in establishing impact markets will do so only if it’s a good idea? Unfortunately the incentivization of risky projects applies at this level. If someone establishes an impact market and it has large benefits, they might expect to be able to sell their impact in establishing the markets for large amounts of money. On the other hand if they establish impact markets and they cause large harms, they won’t lose large amounts of money.
Establishing impact markets would probably involve many high-stakes decisions under great uncertainty. (e.g. should an impact market be launched? Should the impact market be decentralized? Should a certain person be invited to serve as a retro funder? Should certain certificates be deleted? What instructions should be communicated to potential market participants?) We should protect the integrity of these decisions by insulating them from conflicts of interest.
This point seems important even conditional on the people involved being the most careful and EA-aligned people in the world. (Because they are still humans, and humans’ judgment is likely to be affected by biases/self-deception when there is a huge financial profit at stake).
Suggestions
Currently, launching impact markets seems to us (non-robustly) net-negative. The following types of impact markets seem especially concerning:
Decentralized impact markets (in which there are no accountable decision makers that can control or shut down the market).
Impact markets that allow certificates for risky interventions, and especially interventions that are related to the impact market itself (e.g. recruiting new retro funders).
On the other hand, we’re excited about work to further understand the benefits and costs of different funding structures. If there were a robust mechanism to allow the markets to avoid the risks discussed in this post (& ideally handle moral trade as well), we think impact markets could have very high potential. We just don’t think we’re there yet.
In any case, launching an impact market should not be done without (weak) consensus among the EA community, in order to avoid the unilateralist’s curse.
To avoid tricky conflicts of interest, work to establish impact markets should only ever be funded in forward-looking ways. Retro funders should commit to not buying impact of work that led to impact markets (at least work before the time when the incentivization of net-negative projects has been robustly cleared up, if it ever is). EA should socially disapprove of anyone who did work on impact markets trying to sell impact of that work.
All of this relates to markets which encourage retrospective funding (especially but not exclusively if they also allow for the resale of impact).
In particular, this is not intended to apply to introducing market-like mechanisms like explicit allocation of credit between contributors to projects. While such mechanisms may be useful for supporting impact markets, they are also useful in their own right (for propagating price information without distorting incentives), and we’re in favour of experiments with such credit allocation.
[1] The risk was probably first pointed out by Ryan Carey.
I proposed a simple solution to the problem:
For a project to be considered for retroactive funding, participants must post a specific amount of money as collateral.
If a retroactive funder determines that the project was net-negative, they can burn the collateral to punish the people that participated in it. Otherwise, the project receives its collateral back.
This eliminates the “no downside” problem of retroactive funding and makes some net-negative projects unprofitable.
The amount of collateral can be chosen adaptively. Start with a small amount and increase it slowly until the number of net-negative projects is low enough. Note that setting the collateral too high can discourage net-positive but risky projects.
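As a rough illustration of how collateral changes the incentive, here is a minimal sketch using the same illustrative numbers as above and the simple burn-everything rule described in this comment (a sketch of one possible reading of the proposal, not a specification of it):

```python
# Illustrative risky project: 10% chance of success, certificate worth $1M if it succeeds.
p_good = 0.10
cert_value_if_good = 1_000_000

def expected_profit(collateral: float) -> float:
    """Expected profit for participants when the full collateral is burned
    whenever the project is judged net-negative, and returned otherwise."""
    win = cert_value_if_good   # collateral returned, certificate sold
    lose = -collateral         # collateral burned, certificate worthless
    return p_good * win + (1 - p_good) * lose

for collateral in (0, 50_000, 111_112, 200_000):
    print(f"collateral ${collateral:>9,}: expected profit ${expected_profit(collateral):>10,.0f}")

# With no collateral the project is profitable in expectation; around $111k the
# expected profit crosses zero. The adaptive tuning described above amounts to
# searching for the threshold that deters this class of project without
# discouraging net-positive but risky ones.
```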
Related: requiring some kind of insurance that pays out when a certificate becomes net-negative.
Suppose we somehow have accurate positive and negative valuations of certificates. We can have insurers sell put options on certificates, and be required to maintain that their portfolio has positive overall impact. (So an insurer needs to buy certificates of positive impact to offset negative impact they’ve taken on.)
Ultimately what’s at stake for the insurer is probably some collateral they’ve put down, so it’s a similar proposal.
Crypto's inability to impose debts or enact substantial punishments beyond slashing stakes is a huge limitation, and I would like it if we didn't have to swallow that (i.e., if we could just operate in the real world, with non-anonymous impact traders who can be held accountable for more assets than they'd be willing to lock in a contract).
Given enough of that, we would be able to implement this by just having an impact cert that’s implicated in a catastrophe turn into debt/punishment, and we’d be able to make that disincentive a lot more proportional to the scale of its potential negative externalities, and we would be able to allow the market to figure out how big that risk is for itself, which is pretty much the point of an impact market.
Though, on reflection, I'm not sure I would want to let the market decide that. The problem with markets is that they give us a max function: they're made of auctions, whoever pays most decides the price, and the views of everyone else are not taken into account at all. Markets, in a sense, subject us to the decisions of the people with the most extreme beliefs. Eventually the ones who are extreme and wrong go bankrupt and disappear, but I don't find this very reassuring with rare catastrophic risks, which no market participant can have prior experience of. It's making me think of the unilateralist's curse.
So, yeah, maybe we shouldn’t use market processes to price risk of negative externalities.
Nice, that’s pretty interesting. (It’s hacky, but that seems okay.)
It’s easy to see how this works in cases where there’s a single known-in-advance funder that people are aiming to get retro funding from (evaluated in five years, say). Have you thought about whether it could work with a more free market, and not necessarily knowing all of the funders in advance?
This kind of thing could be made more sophisticated by making fines proportional to the harm done, requiring more collateral for riskier projects, or setting up a system to short sell different projects. But simpler seems better, at least initially.
Yeah, that’s a harder case. Some ideas:
People undertaking projects could still post collateral on their own (or pre-commit to accepting a fine under certain conditions). This kind of behavior could be rewarded by retro funders giving these projects more consideration, and the act of posting collateral constitutes a costly signal of quality. But that still requires some pre-commitments from retro funders or a general consensus from the community.
If contributors undertake multiple projects, it should be possible to punish after-the-fact by docking some of their rewards from other projects. For example, if someone participates in 1 beneficial project and 1 harmful project, their retro funding rewards from the beneficial project can be reduced due to their participation in the harmful project. Unfortunately, this still requires some sort of pre-commitment from funders.
I was thinking of this. Small funders could then potentially buy insurance from large funders in order to allow them to fund projects that they deem net positive even though there’s a small risk of a fine that would be too costly for them.
I take it that Harsimony is proposing for the IC-seller to put up a flexible amount of collateral when they start their project, according to the possible harms.
There are two problems, though:
This requires centralised prospective estimation of harms for every project. (A big part of the point of impact certificates is to evaluate things retroactively, and to outsource prospective evaluations to the market, thereby incentivising accuracy in the latter.)
This penalises IC-sellers based on how big their harms initially seem, rather than how big they eventually turn out to be.
It would be better if the IC-seller is required to buy insurance that will pay out the whole cost of the harm, as evaluated retrospectively. In order for the IC-seller to prove that they are willing to be accountable for all harms, they must buy insurance when they sell their IC. And to ensure that the insurer will pay out correctly, we must only allow insurers who use a standard, trusted board of longtermist evaluators to estimate the harms.
This means that a centralised system is only required to provide occasional retrospective evaluations of harm. The task of evaluating harms in prospect is delegated to insurers, similar to the role insurers play in the real world.
(This is my analysis, but the insurance idea was from Stefan.)
Although, the costs of insurance would need to be priced according to the ex ante costs, not the ex post costs.
For example: Bob embarks on a project with a 50% chance of success. If it succeeds, it saves one person’s life, and Bob sells the IC. If it fails, it kills two people.
Clearly, the insurance needs to be priced to take into account a 50% chance of two deaths. So we would have to require Bob to buy the insurance when he initially embarks on the project (which is a tough ask, given that few currently anticipate selling their impact). Or else we would need to rely on a (centralised) retrospective evaluation of ex ante harm, for every project (which seems laborious).
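A minimal sketch of the pricing point in Bob's example (measuring harm in life-equivalents and ignoring insurer margins, purely for illustration):

```python
# Bob's example: 50% chance the project saves one life, 50% chance it kills two.
p_success = 0.5
harm_if_failure = 2.0   # life-equivalents lost if it fails
harm_if_success = 0.0   # no harm if it succeeds

# Ex-ante pricing: the premium reflects the harm distribution at launch time.
fair_premium_ex_ante = p_success * harm_if_success + (1 - p_success) * harm_if_failure
print(f"ex-ante fair premium: {fair_premium_ex_ante} life-equivalents")  # 1.0

# Pricing only after the project has already succeeded sees no harm at all,
# which is why insurance bought only at the moment the IC is sold would be
# systematically too cheap for projects that happened to turn out well.
fair_premium_given_success = harm_if_success
print(f"premium priced after observed success: {fair_premium_given_success}")  # 0.0
```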
I love the insurance idea because, compared to our previous ideas around shorting with hedge tokens that compound automatically to maintain a −1x leverage, collateral, etc. (see Toward Impact Markets), the insurance idea also has the potential to solve the incentive problems that we face around setting up our network of certificate auditors! (Strong upvotes to both of you!)
(The insurance would function a bit like the insurance in Robin Hanson's idea for a tort law reform.)
I don't think that short selling would work. Suppose a net-negative project has a 10% chance to end up being beneficial, in which case its certificates will be worth $1M (and otherwise the certificates will end up being worth $0). The certificates are therefore worth $100K in expectation today. If someone shorts the certificates as if they were worth less than that, they will lose money in expectation.
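A quick sketch of that arithmetic, using the illustrative numbers from the comment above and modelling the short as selling at some price and buying back at the certificate's final value:

```python
# Certificate of a net-negative project: 10% chance it ends up worth $1M, else $0.
p_good = 0.10
value_if_good = 1_000_000
value_if_bad = 0

expected_value = p_good * value_if_good + (1 - p_good) * value_if_bad  # $100K

def short_expected_pnl(sell_price: float) -> float:
    """Expected profit of shorting one certificate at sell_price and
    buying it back at its final value."""
    return sell_price - expected_value

for price in (50_000, 100_000, 150_000):
    print(f"short at ${price:>7,}: expected P&L ${short_expected_pnl(price):>9,.0f}")

# Shorting below the $100K expected value loses money in expectation, so short
# sellers cannot be relied on to push the certificate's price toward zero even
# when everyone agrees the project is net-negative.
```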
I don’t think such a rule has a chance of surviving if impact markets take off?
Added complexity in the norms for trading needs to pay for itself to withstand friction, or else it will decay to its most intuitive equilibrium.
Or the norm for punishing defectors needs to pay for itself in order to stay in equilibrium.
Or someone needs to pay the cost of punishing defectors out of pocket for altruistic reasons.
Once a collateral-charging market takes off, someone could just start up an exchange that doesn't demand collateral and instead just charges a nominal fee that doesn't disincentivise risky investments but would still make them money. Traders would defect to this market if it's more profitable for them.
(To be clear, I think I’m very pro GoodX’s project here; I’m just skeptical of the collateral suggestion.)
Traders might adopt a competitor without negative-externality mechanisms, but charities wouldn't, so there would be no end buyers there; I wouldn't expect that kind of vicious, amoral competitive pressure between platforms to play out.
But afaik the theory of change of this project doesn't rely on altruistic "end buyers"; it relies on profit-motivated speculation? At least, the aim is to make it work even in the worst-case scenario where traders are purely motivated by profit, and still have the trades generate altruistic value. Correct me if I'm wrong.
Update: If it wasn’t clear, I was wrong. :p
My understanding is that without altruistic end-buyers, then the intrinsic value of impact certificates becomes zero and it’s entirely a confidence game.
There might be a market for that sort of ultimately valueless token now (or several months ago? I haven’t been following the NFT stuff), I’m not sure there will be for long.
Aye, I updated. I was kinda dumb. The magical speculation model is probably not worth going for when end-buyers seem within reach.
I think there's an argument for the thing you were saying, though… Something like… If one marketplace forbids most foundational AI public works, then another marketplace will pop up with a different negative externality estimation process, and it won't go away, and most charities and government funders still aren't EA and don't care about undiscounted expected utility, so there's a very real risk that that marketplace would become the largest one.
I guess there might not be many people who are charitably inclined, and who could understand, believe in, and adopt impact markets, but also don't believe in tail risks. There are lots of people who do one of those things, but I'm not sure there are any who do all of them.
Going Forward
We will convene a regular working group to more proactively iterate and improve the mechanism design focused on risk mitigation. We intend for this group to function for the foreseeable future. Anyone is welcome to join this group via our Discord.
We will attempt to consult community figures who have expressed interest in impact markets (Paul Christiano, Robin Hanson, Scott Alexander, Eliezer Yudkowsky, Vitalik Buterin). This should move the needle towards more community consensus.
We will continue our current EA Forum contest. We will not run another contest in July.
We will do more outreach to other projects interested in this space (Gitcoin, Protocol Labs, Optimism, etc.) to make sure they are aware of these issues as well and we can come up with solutions together.
Do we think that impact markets are net-negative?
We – the Impact Markets team of Denis, Dony, and Matt – have been active EAs for almost a combined 20 years. In the past years we’ve individually gone through a prioritization process in which we’ve weighed importance, tractability, neglectedness, and personal fit for various projects that are close to the work of QURI, CLR, ACE, REG, CE, and others. (The examples are mostly taken from my, Denis’s, life because I’m drafting this.) We not only found that impact markets were net-positive but have become increasingly convinced (before we started working on them) that they are the most (positively!) impactful thing in expectation that we can do.
We started our work on impact markets because we found that it was the best thing we could do. We've more or less dedicated our lives to maximizing our altruistic impact – already a decade ago. We did not get nerd-sniped into it and then adjust our prioritization to fit.
We’re not launching impact certificates to make ourselves personally wealthy. We want to be able to pay the rent, but once we’re financially safe, that’s enough. Some of us have previously moved countries for earning to give.
Why do we think impact markets are so good?
Impact markets reduce the work of funders – if a (hits-based) funder expects 10% of their grantees to succeed, then evaluating only the successes retrospectively cuts the funder's work by 10x. The funders pay out correspondingly higher rewards, which incentivize seed investors to pick up the slack. This pool of seed investors can be orders of magnitude larger than current grant evaluators and would be made up of individuals from different cultures, with different backgrounds, and different networks. They have access to funding opportunities that the funders would not have learned of, they can be confident in these opportunities because they come out of their existing networks, and they can make use of economies of scale if the projects they fund have similar needs. These opportunities can also be more numerous and smaller than the opportunities that it would've been cost-effective for a generalist funder to evaluate.
Thus impact markets solve the scaling problem of grantmaking. We envision that the result will be an even more vibrant and entrepreneurial EA space that makes maximal use of the available talent and attracts more talent as EA expands.
What do we think about the risks?
The risks are real – we’ve spent June 2021 to March 2022 almost exclusively thinking about the downsides, however remote, to position us well to prevent them. But abandoning the project of impact markets because of the downsides seems about as misguided to us as abandoning self-driving cars because of adversarial-example attacks on street signs.
A wide range of distribution mismatches can already happen in classic financial markets. These markets don't work where an activity is not currently profitable, but there have been prize contests for otherwise nonprofitable outcomes for a long time. We see an impact market as a type of prize contest.
Attributed Impact may look complicated but we’ve just operationalized something that is intuitively obvious to most EAs – expectational consequentialism. (And moral trade and something broadly akin to UDT.) We may sometimes have to explain why it sets bad incentives to fund projects that were net-negative in ex ante expectation to start, but the more sophisticated the funder is, the less likely it is that we need to expound on this. There’s also probably a simple version of the definition that can be easily understood. Something like: “Your impact must be seen as morally good, positive-sum, and non-risky before the action takes place.”
We already can’t prevent anyone from becoming a retro funder. Anyone with money and a sizable Twitter following can reward people for any contributions that they so happen to want to reward them for – be it AI safety papers or how-tos for growing viruses.
Even if we hone Attributed Impact to be perfectly smooth to communicate and improve it to the point where it is very hard to misapply it, that hypothetical person on Twitter can just ignore it. Chances are they’ll never hear of it in the first place.
The previous point applies here too. Anyone on Twitter with some money can already outbid others when it comes to rewarding actions.
An additional observation is that the threshold for people to seed-invest in projects seems to be high. We think that very few investors will put significant money into a project that is not clearly in line with what major retro funders already explicitly profess to want to retro-fund, merely because there may later be some retro funder who does want it.
There are already long-running prize contests where the ex ante and the ex post evaluation of the expected impact can deviate. These don't routinely seem to cause catastrophes. If they are research prizes outside EA, it's also unlikely that the prize committees will always be sophisticated enough that contenders will trust them to evaluate their projects according to their ex ante impact. Even the misperception that a prize committee would reward a risky project is enough to create an incentive to start the project.
And yet we very much do not want our marketplace to be used for ex ante net-negative activities. We are eager to put safeguards in place above and beyond what any other prize contest in EA has done. As soon as any risks appear to emerge, we are ready to curate the marketplace with an iron fist, to limit the length of resell chains, to cap the value of certificates, to consume the impact we’re buying, and much more.
What are we actually doing?
We are not currently working on a decentralized impact marketplace. (Though various groups in the Ethereum space are, and there is sporadic interest in the EA community as well.)
This is our marketplace. It is a React app hosted on an Afterburst server with a Postgres database. We can pull the plug at any time.
We can hide or delete individual certificates. We’re ready to make certificates hidden by default until we approve them.
You can review the actual submissions that we’ve received to decide how risky the average actual submission is.
We would be happy to form a curation committee and include Ofer and Owen now or when the market grows past the toy EA Forum experiment we have launched so far.
This is our current prize round.
We have allowed submissions that are directly related to impact markets (and have received some, so we don't want to back down from our commitment now), but we're ready to exclude them in future prize rounds.
We would never submit our own certificates to a prize contest that we are judging, but we’d also be open to not submitting any of our impact market–related work to any other prize contests if that’s what consensus comes to.
An important safety mechanism that we have already started implementing is to reward solutions to problems with impact markets. A general ban on using such rewards would remove this promising mechanism.
We don’t know how weak consensus should be operationalized. Since we’ve already launched the marketplace, it seems to us that we’ve violated this requirement before it was put into place. We would welcome a process by which we can obtain a weak consensus, however measured, before our next prize round.
Miscellaneous notes
Attributed Impact also addresses moral trade.
“A naive implementation of this idea would incentivize people to launch a safe project and later expand it to include high-risk high-reward interventions” – That would have to be a very naive implementation because if the actual project is different from the project certified in the certificate, then the certificate does not describe it. It’s a certificate for a different project that failed to happen.
If the main problem you want to solve is "scaling up grantmaking", there are probably many other ways to do it other than "impact markets".
(Roughly, you can amplify any “expert panel of judges” evaluations with judgemental forecasting.)
We’ve considered a wide range of mechanisms and ended up most optimistic about this one.
When it comes to prediction markets on funding decisions, I’ve thought about this in two contexts in the past:
During the ideation phase, I found that it was already being done (by Metaculus?) and not as helpful because it doesn’t provide seed funding.
In Toward Impact Markets, I describe the “pot” safety mechanism that, I surmised, could be implemented with a set of prediction markets. The implementation that I have in mind that uses prediction markets has important gaps, and I don’t think it’s the right time to set up the pot yet. But the basic idea was to have prediction markets whose payouts are tied to decisions of retro funders to buy a particular certificate. That action resolves the respective market. But the yes votes on the market can only be bought with shares in the respective cert or by people who also hold shares in the respective cert and in proportion to them. (In Toward Impact Markets I favor the product of the value they hold in either as determinant of the payout.)
But maybe you’re thinking of yet another setup: Investors buy yes votes on a prediction market (e.g. Polymarket, with real money) about whether a particular project will be funded. Funders watch those prediction markets and participants are encouraged to pitch their purchases to funders. Funders then resolve the markets with their actual grants and do minimal research, mostly trust the markets. Is that what you envisioned?
I see some weaknesses in that model. I feel like it’s rather a bit over 10x as good as the status quo vs. our model, which I think is over 100x as good. But it is an interesting mechanism that I’ll bear in mind as a fallback!
Does this mean that you (the Impact Markets team) may sell certificates of your work to establish an impact market on that very impact market?
I think the analogy would work better if self-driving cars did risky things that could cause a terrible accident in order to prevent the battery from running out or to reach the destination sooner.
I think the following concern (quoted from the OP) is still relevant here:
You later wrote:
Does your current plan not involve explaining to all the retro funders that they should consider the ex-ante EV as an upper bound?
I don’t see how this argument works. Given that a naive impact market incentivizes people to treat extremely harmful outcomes as if they were neutral (when deciding what projects to do/fund), why should your above argument cause an update towards the view that launching a certain impact market is net-positive? How does the potential harm that other people can cause via Twitter etc. make launching a certain impact market be a better idea than it would otherwise be?
Why? Conditional on impact markets gaining a lot of traction and retro funders spending billions of dollars in impact markets 5 years from now, why wouldn’t it make sense to buy many certificates of risky projects that might end up being extremely beneficial (according to at least one relevant future retro funder)?
Do you intend to allow people to profit from outreach interventions that attract new retro funders? (i.e. by allowing people to sell certificates of such outreach interventions.)
I disagree. I think this risk can easily materialize if the description of the certificate is not very specific (and in particular if it’s about starting an organization, without listing specific interventions.)
First of all, what we’ve summarized as “curation” so far could really be distinguished as follows:
Making access for issuers invite-only, maybe keeping the whole marketplace secret (in combination with #2) until we find someone who produces cool papers/articles and who we trust and then invite them.
Making access for investors/retro funders invite-only, maybe keeping the whole marketplace secret (in combination with #1) until we find an impact investor or a retro funder who we trust and then invite them.
Read every certificate either before or shortly after it is published. (In combination with exposé certificates in case we make a mistake.)
Let’s say #3 is a given. Do you think the marketplace would fulfill your safety requirements if only #1, only #2, or both were added to it?
It involves explaining that. What we wrote was to argue that Attributed Impact is not as complicated as it may sound but rather quite intuitive.
If you want to open a bazaar, one of your worries could be that people will use it to sell stolen goods. Currently these people sell the stolen goods online or on other bazaars, and the experience may be a bit clunky. By default these people will be happy to use your bazaar for their illegal trade because it makes life slightly easier for them. Slightly easier could mean that they get to sell a bit more quickly and create a bit more capacity for more stealing.
But if you enact some security measures to keep them out, you quickly reach the point where the bazaar is less attractive than the alternatives. At that point you already have no effect anymore on how much theft there is going on in the world in aggregate.
So the trick is to tune the security measures just right, so that they make the place less attractive to the thieves than the alternatives and yet don't impose prohibitively high costs on the legitimate sellers.
My intent so far was to focus on text that is accessible online, e.g., articles, papers, some books. There may be other classes of things that are similarly strong candidates. Outreach seems like a bad fit to me. I’ve so far only considered it once when someone (maybe you) brought it up as something that’d be a bad fit for an impact market and I agreed.
Also a very bad fit for an impact market as we envision it. To be a good fit, the object needs to have some cultural rights along the lines of ownership or copyright associated with it so market participants can agree on an owner. It needs to have a start and an end in time. It needs to generate a verifiable artifact. Finally, one should not try super hard to fit something into that mold that doesn't fit. There are a bunch of examples in my big post. So a paper, article, book, etc. (a particular version of it) is great. Something ongoing like starting an org is not a good fit. Something where you influence others and most of your impact is leveraging behavior change of others is really awkward because you can't credibly assign an owner.
An impact market with invite-only access for issuers and investors seems safer than otherwise. But will that be a temporary phase after which our civilization ends up with a decentralized impact market that nobody can control or shut down, and people are incentivized to recruit as many new retro funders as they can? In the Toward Impact Markets post (March 2022) you wrote:
That came after the sentence “A web2 solution like that would have a few advantages too:”, after which you listed three advantages that have nothing to do with safety.
I don’t think the analogy works. Right now, there seems to be no large-scale retroactive funding mechanisms for anthropogenic x-risk interventions. Launching an impact market can change that. An issuer/investor/funder who will use your impact market would probably not use Twitter or anything else to deal with retroactive funding if you did not launch your impact market. The distribution mismatch problem applies to those people. (In your analogy there’s a dichotomy of good people vs. thieves, which has no clear counterpart in the domain of retroactive funding.) Also, if your success inspires others to launch/join competing impact markets, you can end up increasing the number of people who use the other markets.
Hm, naively—is this any different than the risks of net-negative projects in the for-profit startup funding markets? If not, I don't think this is a unique reason to avoid impact markets.
My very rough guess is that impact markets should be at a bare minimum better than the for-profit landscape, which already makes it a worthwhile intervention. People participating as final buyers of impact will at least be looking to do good rather than generate additional profits; it would be very surprising to me if the net impact of that was worse than “the thing that happens in regular markets already”.
Additionally—I think the negative externalities may be addressed with additional impact projects, further funded through other impact markets?
Finally: on a meta level, the amount of risk you’re willing to spend on trying new funding mechanisms with potential downsides should basically be proportional to the amount of risk you see in our society at the moment. Basically, if you think existing funding mechanisms are doing a good job, and we’re likely to get through the hinge of history safely, then new mechanisms are to be avoided and we want to stay the course. (That’s not my current read of our xrisk situation, but would love to be convinced otherwise!)
I think startups are usually doing an activity which scales if it’s good and stops if it’s bad. People can sue if it’s causing harm to them. Overall this kind of feedback mechanism does a fine job.
In the impact markets case I’m most worried about activities which have long-lasting impacts even without continuing/scaling them. I’m more into the possibility of markets for scalable/repeatable activities (seems less fraught).
In general the story for concern here is something like:
At the moment a lot of particularly high-leverage areas have disproportionate attention from people who are earnestly trying to do good things
Impact markets could shift this to “attention from people earnestly trying to do high-variance things”
In cases where the resolution on what was successful or not takes a long time, and people potentially do a lot of the activity before we know whether it was eventually valued, this seems pretty bad
They refer to Drescher’s post. He writes:
I think this is not quite right. It shouldn’t be about what we think about existing funding mechanisms, but what we think about the course we’re set to be on. I think that ~EA is doing quite a good job of reshaping the funding landscape especially for the highest-priority areas. I certainly think it could be doing better still, and I’m in favour of experiments I expect to see there, but I think that spinning up impact markets right now is more likely to crowd out later better-understood versions than to help them.
I think impact markets should be viewed in that experimental lens, for what it’s worth (it’s barely been tested outside of a few experiments on the Optimism blockchain). I’m not sure if we disagree much!
Curious to hear what experiments and better funding mechanisms you’re excited about~
Impact markets can incentivize/fund net-negative projects that are not currently of interest to for-profit investors. For example, today it can be impossible for someone to make a huge amount of money by launching an aggressive outreach campaign to make people join EA, or by publishing a list of "the most dangerous ongoing experiments in virology that we should advocate to stop"; these are interventions that may be net-negative. (Also, in cases where both impact markets and classical for-profit investors incentivize a project, one can flip your statement and say that there's no unique reason to launch impact markets; I'm not sure that "uniqueness" is the right thing to look at.) [EDIT: removed unnecessary text.]
I tentatively think that launching impact markets seems worse than a "random" change to the world's trajectory. Conditional on an existential catastrophe occurring, I think there's a substantial chance that the catastrophe will be caused by individuals who followed their local financial incentives. We should be cautious about pushing the world (and EA especially) further towards "big things happen due to individuals following their local financial incentives" dynamics.
Thanks for your responses!
Mostly, I meant: the for-profit world already incentivizes people to take high amounts of risk for financial gain. In addition, there are no special mechanisms to prevent for-profit entities from producing large net-negative harms. So asking that some special mechanism be introduced for impact-focused entities is an isolated demand for rigor.
There are mechanisms like pollution regulation, labor laws, etc which apply to for-profit entities—but these would apply equally to impact-focused entities too.
I think I disagree with this? I think people following local financial incentives is always going to happen, and the point of an impact market is to structure financial incentives to be aligned with what the EA community broadly thinks is good.
Agree that xrisk/catastrophe can happen via eg AI researchers following local financial incentives to make a lot of money—but unless your proposal is to overhaul the capitalist market system somehow, I think building a better competing alternative is the correct path forward.
It may be useful to think about it this way: Suppose an impact market is launched (without any safety mechanisms) and $10M of EA funding are pledged to be used for buying certificates as final buyers 5 years from now. No other final buyers join the market. The creation of the market causes some set of projects X to be funded and some other set of projects Y to not get funded (due to the opportunity cost of those $10M). We should ask: is [the EV of X minus the EV of Y] positive or negative? I tentatively think it's negative. The projects in Y would have been judged by the funder to have positive ex-ante EV, while the projects in X got funded because they had a chance to end up having a high ex-post EV.
Also, I think complex cluelessness is a common phenomenon in the realms of anthropogenic x-risks and meta-EA. It seems that interventions that have a substantial chance to prevent existential catastrophes usually have an EV that is much closer to 0 than we would otherwise think due to also having a chance to cause an existential catastrophe.
Therefore, the EV of Y seems much closer to 0 than the EV of X (assuming that the EV of X is not 0). [EDIT: adding the text below.]
Sorry, I messed up when writing this comment (I wrote it at 03:00 am...). Firstly, I confused X and Y in the sentence that I now crossed out. But more fundamentally: I tentatively think that the EV of X is negative (rather than positive but smaller than the EV of Y), because the projects in X are ones that no funder in EA decides to fund (in a world without impact markets). Therefore, letting an impact market fund a project in X seems even worse than falling into the regular unilateralist’s curse, because here there need not be even a single person who thinks that the project is (ex-ante) a good idea.
I messed up when writing that comment (see the EDIT block).
I didn’t follow this; could you elaborate? (/give an example?)
This reminds me a lot of limited liability (see also Austin’s comment, where he compares it to the for-profit startup market, which because of limited liability for corporations bounds prices below by 0).
This is a historically unusual policy (full liability came first), and seems to me to have basically the same downsides (people do risky things, profiting if they win and walking away if they lose), and basically the same upsides (according to the theory supporting LLCs, there’s too little investment and support of novel projects).
Can you say more about why you think this consideration is sufficient to be net negative? (I notice your post seems very 'do-no-harm' to me instead of 'here are the positive and negative effects, and we think the negative effects are larger'; I'm also interested in Owen's impression of whether or not impact markets lead to more or less phase 2 work.)
Can you explain the “same upsides” part?
I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it’s very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations). The EA community causes activities in anthropogenic x-risk domains, and it’s extremely important that it will differentially cause net-positive activities. This is something we should optimize for rather than regard as an axiom. Therefore, we should be very wary of funding mechanisms that incentivize people to treat extremely harmful outcomes as if they were neutral (when making decisions about doing/funding projects that are related to anthropogenic x-risks).
[EDIT: Also, interventions that are carried out if and only if impact markets fund them seem more likely than otherwise to be net-negative, because they are ones that no classical EA funder would fund.]
Yeah; by default people have entangled assets which will be put at risk by starting or investing in a new project. Limiting the liability that originates from that project to just the assets held by that project means that investors and founders can do things that seem to have positive return on their own, rather than 'positive return given that you're putting all of your other assets at stake.'
[Like I agree that there’s issues where the social benefit of actions and the private benefits of actions don’t line up, and we should try to line them up as well as we can in order to incentivize the best action. I’m just noting that the standard guess for businesses is “we should try to decrease the private risk of starting new businesses”; I could buy that it’s different for the x-risk environment, where we should not try to decrease the private risk of starting new risk reduction projects, but it’s not obviously the case.]
Sure, I agree with this, and with the sense that the costs are large. The thing I’m looking for is the comparison between the benefits and the costs; are the costs larger?
Sure, I buy that adverse selection can make things worse; my guess was that the hope was that classical EA funders would also operate thru the market. [Like, at some point your private markets become big enough that they become public markets, and I think we have solid reasons to believe a market mechanism can outperform specific experts, if there’s enough profit at stake to attract substantial trading effort.]
Efficient impact markets would allow anyone to create certificates for a project and then sell them for a price that corresponds to a very good prediction of their expected future value. Therefore, sufficiently efficient impact markets will probably fund some high-EV projects that wouldn't otherwise be funded (because it's not easy for classical EA funders to evaluate them or even find them in the space of possible projects). If we look at that set of projects in isolation, we can regard it as the main upside of creating the impact market. The problem is that the market does not reliably distinguish between those high-EV projects and net-negative projects, because a potential outcome that is extremely harmful affects the expected future value of the certificate as if the outcome were neutral.
Suppose x is a “random” project that has a substantial chance to prevent an existential catastrophe. If you believe that the EV of x is much smaller than the EV of x conditional on x not causing a harmful outcome, then you should be very skeptical about impact markets. Finally, we should consider that if a project is funded if and only if impact markets exist then it means that no classical EA funder would fund it in a world without impact markets, and thus it seems more likely than otherwise to be net-negative.
(Even if all EA funders switched to operate solely as retro funders in impact markets, I think it would still be true that an intervention that gets funded by an impact market—and wouldn’t get funded in a world without impact markets—seems more likely than otherwise to be net-negative.)
Concretely:
If someone thinks a net-negative project is being traded on (or run at all), how about posting about it on the forum?
I assume anyone who retro funds a project will first run a search here and see what came up.
Meta:
The problem of funding net-negative projects exists also now.
I think that if retroactive public goods funding existed, people would be slightly horrified about canceling it. The upside seems big.
Nobody has seriously run retroactive public goods funding yet, so we're not sure how it will go or what will go wrong. Surely some things will go wrong, I agree about that.
This is a kind of project that we can stop or change if we want to. There is a lot of human discretion. This is not like adding a government regulation that will be very hard to change, or launching a blockchain that you can pretty much never take back no matter what you do.
So I suggest we do experiment with it, fund some good projects (like hpmorpodcast!), and do a few things like encourage people to post on the forum if they think a project is net-negative and being traded. Perhaps even the software system that manages the trade can link to a forum post for each project, so people can voice their opinions there.
As we wrote in the post, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial. For example, consider OpenAI (I'm not making here a claim that OpenAI is net-negative, but it seems that many people in EA think it is, and for the sake of this example let's imagine that everyone in EA thinks that). It's plausible that OpenAI will end up being extremely beneficial. Therefore, if a naive impact market had existed when OpenAI was created, it's likely that the market would have helped in funding its creation (i.e. OpenAI's certificates would have been traded for a high price).
Also, it seems that people in EA (and in academia/industry in general) usually avoid saying bad things publicly about others’ work (due to reasons that are hard to nullify). Another point to consider is that saying that a project is net-negative publicly can sometimes in itself be net-negative due to drawing attention to info hazards. (e.g. “The experiment that Alice is working on is dangerous!”)
As I already wrote in a reply to Austin, impact markets can incentivize/fund net-negative projects that are not currently of interest to for-profit investors. For example, today it can be impossible for someone to make a huge amount of money by launching an aggressive outreach campaign to make people join EA, or publishing a list of “the most dangerous ongoing experiments in virology that we should advocate to stop”; which are interventions that may be net-negative. (Also, in cases where both impact markets and existing mechanisms incentivize a project, one can flip your argument and say that the solution to funding net-positive projects already exist and so we don’t need impact markets. To be clear, I’m not making that argument, I’m just trying to show that the original one is wrong.)
Shutting down an impact market, if successful, functionally means burning all the certificates that are owned by the market participants, who may have already spent a lot of resources and time in the hope to profit from selling their certificates in the future. Obviously, that may not be an easy action for the decision makers to take. Also, if the decision makers have conflicts of interest with respect to shutting down the market, things are even more problematic (which is an important topic that is discussed in the post.) [EDIT: Also, my understanding is that there was (and perhaps still is) an intention to launch a decentralized impact market (i.e. Web3 based), which can be impossible to shut down.]
OpenAI
I recommend that early on someone posts:
“OpenAI is expected to be negative on net even if it ends up randomly working [...] So retro funders, if you agree with this, please don’t retro-fund OpenAI, even if it works”
I expect this will reduce the price at which OpenAI is traded
People in EA (and in academia/industry in general) usually avoid saying bad things publicly about others’ work
Yeah. Anonymous comments? Messaging FTX (or who ever does the retro funding) ? This seems like the kind of thing we’d learn on the fly, no?
Or an item on the checklist before retro-funding?
This problem already exists today when you fund someone for work that they’re planning to do, no?
launching an aggressive outreach campaign to make people join EA
I bet EA has this problem today already and CEA deals with it somehow. Wanna ask them?
(In other words, doesn't seem like a new problem. No?)
Publishing a list of “the most dangerous ongoing experiments in virology that we should advocate to stop”
I assume CEA+LW already remove such infohazards from the forum(s) and would reach out to EAs who publish this stuff elsewhere, if it comes to their attention.
= probably not a new problem.
And you could probably ask CEA/LW what they think since it’s their domain (?) They might say this is totally a huge problem and making it worse is really bad, I don’t know
One can flip your argument and say that the solution to funding net-positive projects already exist and so we don’t need impact markets
Nice! (I like this kind of argument <3 )
I missed what you’re replying to though. Is it the “The problem of funding net-negative projects exists also now.” ?
I’m pointing at “we’re not creating any new problem [that we have no mechanism to solve]”. (I’ll wait for your reply here since I suspect I missed your point)
Human discretion / Shutting down an impact market
What I actually mean is not that the people running the market will shut it down.
What I mean is that the retro funders can decide which projects to fund or not to fund and they have a ton of flexibility around this. (I can say more)
But an impact market can still make OpenAI’s certificates be worth $100M if, for example, investors have at least 10% credence in some future retro funder being willing to buy them for $1B (+interest). And that could be true even if everyone today believed that creating OpenAI is net-negative. See the “Mitigating the risk is hard” section in the OP for some additional reasons to be skeptical about such an approach.
Yes. You respond to examples of potential harm that impact markets can cause by pointing out that these things can happen even without impact markets. I don’t see why these arguments should be more convincing than the flipped argument: “everything that impact markets can fund can already be funded in other ways, so we don’t need impact markets”. (Again, I’m not saying that the flipped argument makes sense.)
Your overall view seems to be something like: we should just create an impact market and if it causes harm then the retro funders will notice and stop buying certificates (or they will stop buying some particular certificates that are net-negative to buy). I disagree with this view because:
There is a dire lack of feedback signal in the realm of x-risk mitigation. It’s usually very hard to judge whether a given intervention was net-positive or net-negative. It’s not just a matter of asking CEA / LW / anyone else what they think about a particular intervention, because usually no one on Earth can do a reliable, robust evaluation. (e.g. is the creation of OpenAI/Anthropic net positive or net negative?) So, if you buy the core argument in the OP (about how naive impact markets incentivize people to carry out interventions without considering potential outcomes that are extremely harmful), I think that you shouldn’t create an impact market and rely on some unspecified future feedback signals to make retro funders stop buying certificates in a net-negative way at some unspecified point in the future.
As I argued in the grandparent comment, we should expect the things that people in EA say about the impact of others in EA to be positively biased.
All the above assumes that by “retro funders” here you mean a set of carefully appointed Final Buyers. If instead we’re talking about an impact market where anyone can become a retro funder, and retro funders can resell their impact to arbitrary future retro funders, I think things would go worse in expectation (see the first three points in the section “Mitigating the risk is hard” in the OP).
It could be done a bit more smoothly by (1) accepting no new issues, (2) completing all running prize rounds, and (3) declaring the impact certificates not burned and allowing people some time to export their data. (I don’t think it would be credible for the marketplace to declare the certs burned since it doesn’t own them.)
My original idea from summer 2021 was to use blockchain technology simply for technical ease of implementation (I wouldn’t have had to write any code). That would’ve made the certs random tokens among millions of others on the blockchain. The plan was then to set up a centralized, curated marketplace for them with a smart, EA-aligned curation team.
We’ve moved away from that idea. Our current market is fully web2 with no bit of blockchain anywhere. Safety was a core reason for the update. (But the ease-of-implementation reasons to prefer blockchain also didn’t apply so much anymore. We have a doc somewhere with all the pros and cons.)
For our favored auction mechanisms, it would be handy to be able to split transactions easily, so we have thought about (maybe, at some point) allowing users to connect a wallet to improve the user experience, but that would be only for sending and receiving payments. The certs would still be rows in a Postgres database in this hypothetical model. Sort of like how Rethink Priorities accepts crypto donations or a bit like a centralized crypto exchange (but that sounds a bit pompous).
But what do you think about the original idea? I don’t think it’s so different from a fully centralized solution where you allow people to export their data or at least not prevent them from copy-pasting their certs and ledgers to back them up.
My greatest worries about crypto stem less from the technology itself (which, for all I know, could be made safe) than from the general spirit in the community that decentralization, democratization, ungatedness, etc. are highly desirable values to strive for. I don’t want to have to fight against the dominant paradigms, so doing it on my own server was more convenient. But then again, big players in the Ethereum space have implemented very much expert-run systems with no permissionless governance tokens and such. So I hope (and think) that there are groups that can be convinced that an impact market should be gated and curated by trusted experts only.
But even so, I think of any crypto-based solution that goes beyond making payments easier as something to pursue by joining existing efforts and making them safer, rather than as an action that would affect whether those efforts exist at all.
That could make it easier for another team to create a new impact market that will seamlessly replace the impact market that is being shut down.
If a decentralized impact market gains a lot of traction, I don’t see how the certificates being “tokens among millions of others” helps. A particular curated gallery can end up being ignored by some/most market participants (and perhaps be outcompeted by another, less scrupulous curated gallery).
Okay, but to keep the two points separate:
Allowing people to make backups: You’d rather make it as hard as possible to make backups, e.g., by using anti-screenscraping tools and maybe hiding some information about the ledger in the first place so people can’t easily back it up.
Web3: Seems about as bad as any web2 solution that allows people to easily back up their data.
Is that about right?
I think that a decentralized impact market that can’t be controlled or shut down seems worse. Also, a Web3 platform will make it less effortful for someone to launch a competing platform (either with or without the certificates from the original platform).
I should mention that the Good Exchange/impact certs people have discussed this quite a bit. I raised concerns about this issue early on here. Shortly afterwards, I posted the question “Would (myopic) general public good producers significantly accelerate the development of AGI?” to LessWrong.
My current thoughts are similar to harsimony’s: it’s probably possible to get the potential negative externalities of a project to factor into the price of the impact cert by having certs take on negative value and turn into liabilities/debts if the negative outcomes end up eventuating.
We don’t know exactly how to implement that well yet, though.
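As a rough illustration of how that could change the incentive, here is a minimal sketch assuming made-up numbers and a hypothetical `liability_share` parameter (this is not harsimony’s actual proposal or any existing mechanism):

```python
# Toy model of certificate pricing with and without downside liability.
# All numbers are illustrative assumptions, not estimates for any real project.

p_good, p_bad = 0.5, 0.5        # assumed probabilities of a beneficial vs. harmful outcome
benefit_if_good = 1_000_000     # social value created if the project goes well
harm_if_bad = 3_000_000         # social cost if the project backfires

# The project itself is net-negative in expectation:
project_ev = p_good * benefit_if_good - p_bad * harm_if_bad          # -1,000,000

# Naive impact market: retro funders pay roughly the realized benefit, and a harmful
# outcome is priced the same as a neutral one, i.e. the cert is then worth $0.
naive_price = p_good * benefit_if_good + p_bad * 0                   # +500,000

# Liability variant: certificate holders also owe some share of the realized harm.
liability_share = 1.0           # hypothetical: harm is fully internalized by holders
liability_price = p_good * benefit_if_good - p_bad * liability_share * harm_if_bad

print(f"Project EV:           ${project_ev:,.0f}")        # negative
print(f"Naive cert price:     ${naive_price:,.0f}")       # positive anyway
print(f"Price with liability: ${liability_price:,.0f}")   # negative, so no one invests
```

With full internalization (`liability_share = 1.0`) the certificate price collapses to the project’s EV; the unresolved part is how such liabilities would actually be assessed and collected.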
I think that retroactive Impact Markets may be a net negative for many x-risk projects; however, I also think that in general Impact Markets may significantly reduce x-risk.
I think you have to bear in mind that if this project is highly successful, it has the potential to create a revolution in the funding of public goods. If humanity achieves much better funding and incentive mechanisms for public goods, this could create a massive increase in the efficiency of philanthropy.
It is hard to know how beneficial such a system would be, but it is not hard to see how multiplying the effectiveness of philanthropy and public good provision could make society function much better by improving education, coordination, mental health, moral development, health, etc., and increases in these public goods could broadly improve humanity’s ability to confront many x-risks.
I think it may make sense for Impact Markets to find ways of limiting or excluding x-risk projects, but I think abandoning Impact Markets altogether would be a mistake, and considering their massive upsides I cannot say I agree that they are net negative in expectation, even without excluding x-risk projects.
Dawn’s (Denis’s) Intellectual Turing Test Red-Teaming Impact Markets
[Edit: Before you read this, note that I failed. See the comment below.]
I want to check how well I understand Ofer’s position against impact markets. The “Imagined Ofer” below is how I imagine Ofer to respond (minus language – I’m not trying to imitate his writing style though our styles seem similar to me). I would like to ask the real Ofer to correct me wherever I’m misunderstanding his true position.
I currently favor using the language of prize contests to explain impact markets unless I talk to someone intimately familiar with for-profit startups. People seem to understand it more easily that way.
My model of Ofer is informed by (at least) these posts/comment threads.
Dawn: I’m doing these prize contests now where I encourage people to help each other (monetarily and otherwise) to produce awesome work to reduce x-risks, and finally I reward all participants in the best of the projects. I’m writing software to facilitate this. I will only reward them in proportion to the gains from moral trade that they’ve generated, and I’ll use my estimate of their ex ante EV as a ceiling for my overall evaluation of a project.
This has all sorts of benefits! It’s basically a wide-open regrantor program where the quasi-regrantors (the investors) absorb most of the risk. It scales grantmaking up and down: grantmakers have ~10x less work and can thus scale their operation up by 10x, and the investors can be anyone around the world, so they can draw on their existing networks and consider many more, much smaller investments, or investments that require very niche knowledge or access. Many more ideas will get tried, and it’ll be easier for people to start projects even when they still lack personal contact with the right grantmakers.
Imagined Ofer: That seems very dangerous to me. What if someone else also offers a reward and also encourages people to help each other with the projects but does not apply your complicated ex ante EV ceiling? Someone may create a flashy but extremely risky project and attract a lot of investors for it.
Dawn: But they can do that already? All sorts of science prizes, all the other EA-related prizes, Bountied Rationality, new prizes they promote on Twitter, etc.
Imagined Ofer: Okay, but you’re building software to make it easier, so presumably you’ll thereby increase the number of people who will offer such prizes and the number of people who will attract investments in advance, because the user experience and the networking with investors are smoother and because they’re encouraged to do so.
Dawn: That’s true. We should make our software relatively unattractive to such prize offerers and their audiences, for example by curating the projects on it such that only the ones that are deemed to be robustly positive in impact are displayed (something I proposed from the start, in Aug. 2021). I could put together a team of experts for this.
Imagined Ofer: That’s not enough. What if you or your panel of experts overlook that a project was actually ex ante net-negative in EV, for example because it has already matured and so happened to turn out good? You’d be biased in a predictably upward direction in your assessment of the ex ante EV. In fact, people could do a lot of risky projects and then only ever submit the ones that worked out fine.
Dawn: Well, we can try really hard… Pay bounties for spotting projects that were negative in ex ante EV but slipped through; set up a network of auditors; make it really easy and effortless to hold compounding short positions on projects that manage their −1x leverage automatically; recruit firms like Hindenburg Research (or individuals with similar missions) to short projects and publish exposés on them; require issuers to post collateral; set up mechanisms whereby it becomes unlikely that there’ll be other prizes with any but a small market share (such as the “pot”); maybe even require preregistration of projects to avoid the tricks you mention; etc. (All the various fixes I propose in Toward Impact Markets.)
Imagined Ofer: Those are only unreliable patches for a big fundamental problem. None of them is going to be enough, not even in combination. They are slow and incomplete. Ex ante negative projects can slip through the cracks or remain undetected for long enough to cause harm in this world or a likely counterfactual world.
Dawn: Okay, so one slips through, attracts a lot of investment, gets big, maybe even manages to fool us into awarding it prize money. It or new projects in the same reference class have some positive per-year probability of being found out due to all the safety mechanisms. Eventually a short-seller or an exposé-bounty poster will spot them and make a lot of money for doing so. We will react and make it super-duper clear going forward that we will not reward projects in that reference class ever again. Anyone who wants to get investments will need to make the case that their project is not in that reference class.
Imagined Ofer: But by that time the harm is done, be it only in a counterfactual world. Next time the harm will be done in the factual world. Besides, regardless of how safe you actually make the system, what’s important is that there can always be issuers and investors who believe (be it wrongly) that they can get their risky project retro-funded. You can’t prevent that no matter how safe you make the system.
Dawn: But that seems overly risk averse to me because prospective funders can also make mistakes, and current prizes – including prizes in EA – are nowhere near as safe. Once our system is safer than any other existing methods, the bad actors will prefer the existing methods.
Imagined Ofer: The existing methods are much safer. Prospective funding is as safe as it gets, and current prizes have a time window of months or so, so by the time the prizes are awarded, the projects that they are awarded to are still very young, so the prizes are awarded on the basis of something that is still very close to ex ante EV.
Dawn: But retroactive funders can decide when to award prizes. In fact, we have gone with a month in our experiment. But admittedly, in the end I imagine that cycles of a year or two are more realistic. That is still not that much more. (See this draft FAQ for some calculations. Retro funders will pay out prizes of up to 1000% in the success case, but outside the success case investors will lose all or most of their principal. They are hits-based investors, so the benchmark return they aim for is probably much higher than a riskless 5% per year. They’ll probably not want to stay in certificates for more than a few years, even at a 1000% return in the success case.)
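(An illustrative break-even check behind that holding-period point, with assumed round numbers rather than figures from the draft FAQ: if an investor has success credence $p$, gets roughly a 10x payoff in the success case and roughly nothing otherwise, and could earn $r$ per year elsewhere, then holding a certificate for $n$ years only pays if

$$10\,p \ge (1+r)^n .$$

At, say, $p = 0.2$ and $r = 0.2$, this gives $n \le \ln 2 / \ln 1.2 \approx 3.8$ years, which is why horizons of a few years, but not decades, look plausible.)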
Imagined Ofer: A lot more can happen in a year or two than in a month. EA, for example, looked very different in 2013 compared to 2015, but it looked about the same in January vs. February 2015. But more importantly, you write about tying the windfall clauses of AGI companies to retro funding with enormous budgets, budgets that surely offset even the 20 years that it may take to get to that point and the very low probability.
Dawn: The plan I wrote about has these windfalls reward projects that were previously already rewarded by our regular retro funders, no more.
Imagined Ofer: But what keeps a random, unaligned AGI company from just using the mechanism to reward anyone they like?
Dawn: True. Nothing. Let’s keep this idea private. I can unpublish my EA Forum post too, but maybe that’s the audience that should know about it if anyone should. As an additional safeguard against uncontrolled speculation, how about we require people to always select one or several actual present prize rounds when they submit a project?
Imagined Ofer: That might help, but people could just churn out small projects and select whatever prize happens to be offered at the time, whereas in actuality they’re hoping that one of these prizes will eventually be mentioned in a windfall clause, or that their project will otherwise be retro-funded through a windfall clause or some other future funder who ignores the setting.
Dawn: Okay, but consider how far down the rabbit hole we’ve gone now: We have a platform that is moderated; we have relatively short cycles for the prize contest (currently just one month); we explicitly offer prizes for exposés; we limit our prizes to stuff that is, by dint of its format, unlikely to be very harmful; we even started with EA Forum posts, a forum that has another highly qualified moderation team. Further, we want to institute more mechanisms – besides exposés – that make it easy to short certificates to encourage people to red-team them; mechanisms to retain control of the market norms even if many new retro funders enter; even stricter moderation; etc. We’re even considering requiring preregistration, mandatory selection of present prize rounds (even though it runs counter to how I feel impact markets should work), and very narrow targets set by retro funders (like my list of research questions in our present contest). Compare that to other EA prize contests. Meanwhile, the status quo is that anyone with some money and a Twitter following can do a prize contest, and anyone can make a contract with a rich friend to secure a seed investment that they’ll repay if they win. All of our countless safeguards should make it vastly more attractive for unaligned retro funders and unaligned project founders to use anything other than our platform. All that remains is that maybe we’re spreading the meme that you can seed-invest into potential prize winners, but that’s also something that is already happening around the world with countless science prizes. What more can we do?!
Imagined Ofer: This is not an accusation – we’re all human – but money and sunk time-cost fallacy corrupt. For all I know this could be a motte-and-bailey type of situation: The moment a big crypto funder offers you a $1m grant, you might throw caution to the wind and write a wide-open ungated blockchain implementation of an impact market.
Dawn: I hope I’ve made clear in my 20,000+ words of writing on impact market safety that were unprompted by your comments (other than the first one in 2021) that my personal prioritization has long rested on robustness over mere positive EV. I’ve just quit my well-paid ETG job as a software engineer in Switzerland to work on this. If I were in it for the money (beyond what I need for my financial safety), I wouldn’t be doing this. Our organization is also set up with a very general purview so that we can pivot easily. So if I should start work on a more open version of the currently fully moderated, centralized implementation, it’s because I’ve come to believe that it’s more robustly positive than I currently think it is. (Or it may well be possible to find a synthesis of permissionlessness and curation.) The only things that can convince me otherwise are evidence and arguments.
Imagined Ofer: I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it’s very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations). So the typical EA Forum post with sufficient leverage over our future to make a difference at all is about equally likely to increase or to decrease x-risk.
Dawn: I find that to be an unusual opinion. CEA and others try to encourage people to post on the EA Forum rather than discourage them. That was also the point of the CEA-run EA Forum contest. Personally, I also find it unintuitive that that should be the case: For any given post, I try to think of pathways along which it could be beneficial and detrimental. Usually there are few detrimental pathways, and if there are any, there are strong social norms against malice, and government institutions such as the police, standing in the way of pursuing those paths. A few posts come to mind that are rare, unusual exceptions to this theme, but it’s been several years since I read one of those. Complex cluelessness also doesn’t seem to make a difference here because it applies equally to any prospective funding, to prizes after one month, and to prizes after one year. Do you think that writing on high-leverage topics such as x-risks should generally be discouraged rather than encouraged on the EA Forum?
Imagined Ofer: Even if you create a very controlled impact market that is safer than the average EA prize contest, you are still creating a culture and a meme around retroactive funding. You could inspire someone to post on Twitter: “The current impact markets are too curated. I’m offering a $10m retro prize for dumping 500 tons of iron sulfate into the ocean to solve climate change.” If someone posted this now, no one would take them seriously. If you create an impact market with tens of millions of dollars flowing through it and many market actors, it will become believable to some rogue players that this payout is likely real.
I do not endorse the text written by “Imagined Ofer” here. Rather than describing all the differences between that text and what I would really say, I’ve now published this reply to your first comment.
Seems like a case of “How x-risk projects are different from startups”.
This post seems like a really great example in the wild of how distribution mismatch could occur.
Another comment, specifically about AGI capabilities:
If someone wants to advance AI capabilities, they can already get prospective funding by opening a regular for-profit startup.
No?
Right. But without an impact market it can be impossible to profit from, say, publishing a post with a potentially transformative insight about AGI development. (See this post as a probably-harmless version of the type of posts I’m talking about here.)
I acknowledge this could be bad, but (as with most of my comments here), this is not a new problem.
Also, today, if someone publishes such a post on the Alignment Forum, I hope there’s moderation to take it down, whether the author expects to make money from it or not.
Or is your worry something like “there will be 10x more such posts and the moderation will be overloaded”?
It’s just an example of how a post on the Alignment Forum can be net-negative and how it can be very hard to judge whether it’s net-negative. For any net-negative intervention that impact markets would incentivize, if people can do it without funding, then the incentive to do impressive things can also cause them to carry out the intervention. In those cases, impact markets can cause those interventions to be more likely to be carried out.
I hope I’m not strawmanning your claim and please call me out if I am,
but,
Seems like you are arguing that this makes [a risk] more likely which, as you point out, already happened, and which the AF could address at almost no cost but chose not to.
..right?
So.. why do you think it’s a big problem?
Or at least.. seems like the AF disagrees about this being a problem.. no?
(Please say if this is an unfair question somehow)
(Not an important point [EDIT: meaning the text you are reading in these parentheses], but I don’t think that a karma of 18 points is proof of that; maybe the people who took the time to go over that post and vote are mostly amateurs who found the topic interesting. Also, as an aside, if someone one day publishes a brilliant insight about how to develop AGI much faster, taking the post down can be net-negative due to the Streisand effect).
I’m confident that almost all the alignment researchers on Earth will agree with the following statement: conditional on such a post having a transformative impact, it is plausible [EDIT: >10% credence] that the post will have an extremely harmful impact. [EDIT: “transformative impact” here means impact that is either extremely negative or extremely positive.] I argue that we should be very skeptical about potential funding mechanisms that incentivize people to treat “extremely harmful impact” here as if it were “neutral impact”. A naive impact market is such a funding mechanism.
You changed my mind!
I think the missing part, for me, is a public post saying “this is what I’m going to do, but I haven’t started yet”, which is what the prospective funder sees, and which would let the retro funder say “hey, you shouldn’t have funded this plan”.
I think.
I’ll think about it
I think you’re missing the part where if such a marketplace was materially changing the incentives and behavior of the Alignment Forum, people could get an impact certificate for counterbalancing externalities such as critiquing/flagging/moderating a harmful AGI capabilities post, possibly motivating them to curate more than a small moderation team could handle.
That’s not to say that in that equilibrium there couldn’t be an even stronger force of distributionally mismatched positivity bias, e.g. upvote-brigading assuming there are some Goodhart incentives to retro fund posts in proportion to their karma, but it is at least strongly suggestive.
Ofer (and Owen), I want to understand and summarize your cruxes one by one, in order to sufficiently pass your Ideological Turing Test that I can regenerate the core of your perspective. Consider me your point person for communications.
Crux: Distribution Mismatch of Impact Markets & Anthropogenic X-Risk
If I understand one of the biggest planks of your perspective correctly, you believe that the utility of x-risk projects follows a high-variance, roughly normal distribution centered around 0, such that x-risk projects can often increase x-risk rather than decrease it. I have been concerned for a while that the x-risk movement may be bad for x-risk, so I am quite sympathetic to this claim, though I do believe some significant fraction of potential x-risk projects approach being robustly good. That said, I think we are basically in agreement that a large subset of the x-risk projects that could in principle be carried out would actually increase x-risk, though it’s harder to be sure what that share is in practice with real x-risk projects, given that people generally, if not always, avoid the obviously bad stuff.
The examples you are most concerned by in particular are biosecurity and AI safety (as mentioned in a previous comment of yours), due to potential infohazards of posts on the EA Forum, as well as meta EA mentioned above. You have therefore suggested that impact markets should not deal with these causes, either early on such as during our contest or presumably indefinitely.
Let me use at least one example set of particular submissions that may fall under these topics, and let me know what you think of them.
I was thinking it would be quite cool if both Yudkowsky and Christiano respectively submitted certificates for their posts, ‘List of Lethalities’ and ‘Where I agree and disagree with Eliezer’. These are valuable posts in my opinion and they would help grow an impact marketplace.
My model of you would say either that:
1) funding those particular posts is net bad, or
2) funding those two posts in particular may be net good, but it sets a precedent that will cause there to be further counterfactual AI safety posts on EA Forum due to retroactive funding, which is net bad, or
3) posts on the EA Forum/LW/Alignment Forum being further incentivized would be net good (minus stuff such as infohazards, etc), but a more mature impact market at scale risks funding the next OpenAI or other such capabilities project, therefore it’s not worth retroactively funding forum posts if it risks causing that.
I am tentatively guessing your view is something at least subtly different from those rough disjunctions, though not too different.
Looking at our current submissions empirically, my sense is that the potentially riskiest certificate we have received is ‘The future of nuclear war’ by Alexei Turchin. The speculation in it could potentially provide new ideas to bad actors. I don’t know, I haven’t read/thought about this one in detail yet. For instance, core degassing could be a new x-risk, but it also seems highly unlikely. This certificate could also be the most valuable. My model of you says this certificate is net-negative. I would agree that it may be an example of the sort of situation where some people believe a project is a positive externality and some believe it’s a negative externality, but the distribution mismatch means it’s valued positively by a marketplace that can observe the presence of information but not its absence. Or maybe the market thinks riskier stuff may win the confidence game. ‘Variance is sexy’. This is a very provisional thought and not anything I would clearly endorse; I respect Alexei’s work quite highly!
After your comments saying it would be good to ban these topics, I was considering conceding that condition, because doing so doesn’t seem too problematic for the contest, and by and large I still think that; though again, I would also specifically quite like to see those two AI posts submitted if the authors want that.
I’m curious to know your evaluation of the following possible courses of action, particularly by what percentage your concern is reduced vs other issues:
impact markets are isolated from x-risk topics for all time using magic, but they are not isolated from funding meta EA, which could downstream affect x-risk
impact markets are isolated from x-risk topics and from funding meta EA for all time using magic, they only fund object-level stuff such as global health and development
we don’t involve x-risk topics in our marketplace for the rest of the year
we don’t involve x-risk topics until there is a clear counterbalancing force to distribution mismatch in the mechanism design, in a way that can be mathematically modelled, which may be necessary if not sufficient for proving the mechanism design works
or you or agents you designate are satisfied that a set of informal processes, norms, curation processes, etc. are achieving this for a centralized marketplace
though I would predict this does not address your crux that a centralized impact market may inspire / devolve into a simpler set of equilibria of retro funding that doesn’t use e.g. Attributed Impact, probably in conjunction with decentralization
I can start a comment thread for discussing this crux separately
we do one of the above but allow the two AI posts as exceptions
That list is just a rough mapping of potential actions. I have probably not characterized sufficiently well your position to offer a full menu of actions you may like to see taken regarding this issue.
tl;dr is that I’m basically curious 1) how much you think the risk is dominated by distribution mismatch applying specifically to x-risk vs., say, global poverty, 2) on which timeframes it is most important to shape the cause scope of the market in light of that (now? at full scale? both?), 3) whether banning x-risk topics from early impact markets (in ~2022) is a significant risk reducer by your lights.
(Meta note: I will drop in more links and quotes some time after publishing this.)
I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it’s very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations).
My best guess is that those particular two posts are net-positive (I haven’t read them entirely / at all). Of course, this does not imply that it’s net-positive to use these posts in a way that leads to the creation of an impact market.
In (3) you wrote “posts on the EA Forum/LW/Alignment Forum […] (minus stuff such as infohazards, etc)”. I think this description essentially assumes the problem away. Posts are merely information in a written form, so if you exclude all the posts that contain harmful information (i.e. info hazards), the remaining posts are by definition not net-negative. The hard part is to tell which posts are net-negative. (Or more generally, which interventions/projects are net-negative.)
The distribution mismatch problem is not caused by different people judging the EV differently. It would be relevant even if everyone in the world was in the same epistemic state. The problem is that if a project ends up being extremely harmful, its certificates end up being worth $0, same as if it ended up being neutral. Therefore, when market participants who follow their local financial incentives evaluate a project, they treat potential outcomes that are extremely harmful as if they were neutral. I’m happy to discuss this point further if you don’t agree with it. It’s the core argument in the OP, so I want to first reach an agreement about it before discussing possible courses of action.
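To put the same point in symbols (a rough formalization, glossing over resale and interest): if $V$ is a project’s realized impact, society bears $V$, but a certificate holder’s payoff is roughly proportional to $\max(V, 0)$. Market prices therefore track

$$\mathbb{E}[\max(V, 0)] \;\ge\; \mathbb{E}[V],$$

and the gap between the two is exactly the expected magnitude of the downside, which is largest for high-variance projects whose failure modes are catastrophic.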
Based on your arguments here, I wouldn’t be excited about impact markets. Still, if impact markets significantly expand the amount of work done for social good, it’s plausible that the additional good outweighs the additional bad. Furthermore, people looking to make money are already funding net-negative companies due to essentially the same problem (companies have non-negative valuations), so shifting them towards impact markets could be good, if impact markets have better projects than existing markets on average.
See my reply to Austin.
I am going to be extremely busy over the next week as I prep for the EAEcon retreat and wrap up the retro funding contest, among other things. Combining that with the generally high personal emotional cost of engaging with this post, I will choose not to comment further for at least a week so I can focus my energy elsewhere (unless inspiration strikes me).
Here are a couple of considerations relevant to why I at least have not been more responsive, generally speaking:
https://forum.effectivealtruism.org/posts/oxZ5yaZKkwEswbCNs/limits-to-legibility
https://forum.effectivealtruism.org/posts/KFMMRyk6sTFReaWjs/you-don-t-have-to-respond-to-every-comment
Is there any real-world evidence of the unilateralist’s curse being realised? My sense historically is that this sort of reasoning to date has been almost entirely hypothetical, and has done a lot to stifle innovation and exploration in the EA space.
If COVID-19 is a result of a lab leak that occurred while conducting a certain type of experiment (for the purpose of preventing future pandemics), perhaps many people considered conducting/funding such experiments and almost all of them decided not to.
I think we should be careful with arguments that such and such existential risk factor is entirely hypothetical. Causal chains that end in an existential catastrophe are entirely hypothetical and our goal is to keep them that way.
I’m talking about the unilateralist’s curse with respect to actions intended to be altruistic, not the uncontroversial claim that people sometimes do bad things. I find it hard to believe that any version of the lab leak theory involved all the main actors scrupulously doing what they thought was best for the world.
I think we should be careful with arguments that existential risk discussions require lower epistemic standards. That could backfire in all sorts of ways, and leads to claims like one I heard recently from a prominent player that a claim about artificial intelligence prioritisation for which I asked for evidence is ‘too important to lose to measurability bias’.
I don’t find it hard to believe at all. Conditional on a lab leak, I’m pretty confident no one involved was consciously thinking: “if we do this experiment it can end up causing a horrible pandemic, but on the other hand we can get a lot of citations.”
Dangerous experiments in virology are probably usually done in a way that involves a substantial amount of effort to prevent accidental harm. It’s not obvious that virologists who are working on dangerous experiments tend to behave much less scrupulously than people in EA who are working for Anthropic, for example. (I’m not making here a claim that such virologists or such people in EA are doing net-negative things.)
Strong disagree. A bioweapons lab working in secret on gain-of-function research for a somewhat belligerent despotic government, which denies everything after an accidental release, is nowhere near any model I have of ‘scrupulous altruism’.
Ironically, the person I mentioned in my previous comment is one of the main players at Anthropic, so your second paragraph doesn’t give me much comfort.
I think that it’s more likely to be the result of an effort to mitigate potential harm from future pandemics. One piece of evidence that supports this is the grant proposal, which was rejected by DARPA, that is described in this New Yorker article. The grant proposal was co-submitted by the president of the EcoHealth Alliance, a non-profit which is “dedicated to mitigating the emergence of infectious diseases”, according to the article.
I don’t understand your sentence/reasoning here. Naively, this should strengthen Ofer’s claim, not weaken it.
Why? The less scrupulous one finds Anthropic in their reasoning, the less weight a claim that Wuhan virologists are ‘not much less scrupulous’ carries.
In 2015, when I was pretty new to EA, I talked to a billionaire founder of a company I worked at and tried to pitch them on it. They seemed sympathetic but empirically it’s been 7 years and they haven’t really done any EA donations or engaged much with the movement. I wouldn’t be surprised if my actions made it at least a bit harder for them to be convinced of EA stuff in the future.
In 2022, I probably wouldn’t do the same thing again, and if I did, I’d almost certainly try to coordinate a bunch more with the relevant professionals first. Certainly the current generation of younger highly engaged EAs seemed more deferential (for better or worse) and similar actions wouldn’t be in the Overton window.
Unless ~several people in EA had an opportunity to talk to that billionaire, I don’t think this is an example of the unilateralist’s curse (regardless of whether it was net negative for you to talk to them).
Fair, though many EAs are probably in positions where they can talk to other billionaires (especially with >5 hours of planning), and probably chose not to do so.