Centralised coordination/control is a way to counteract that.
OpenPhil funding OpenAI might be a case of a “central” organization taking unilateral action that’s harmful. vollmer also mentions elsewhere in this thread that he thinks some of EAF’s subprojects probably had negative impact—presumably the EAF is relatively “central”.
If we think that “individuals underestimate potential downsides relative to their estimations concerning potential upsides”, why do we expect funders to be immune to this problem? There seems to be an assumption that if you have a lot of money, you are unusually good at forecasting potential downsides. I’m not sure. People like Joey and Paul Christiano have offered prizes for the best arguments against their beliefs. I don’t believe OpenPhil has ever done this, despite having a lot more money.
In general, funding doesn’t do much to address the unilateralist’s curse, because any single funder can act unilaterally to fund a project that all the other funders think is a bad idea. I once proposed an EA donors’ league to address this problem, but people weren’t too keen on it for some reason.
it doesn’t seem clear to me that “the fact that some EA thinks it’s a good idea” is sufficient grounds to attribute positive expected value to a project, given no other information
Here’s a thought experiment that might be helpful as a baseline scenario. Imagine you are explaining effective altruism to a stranger in a loud bar. After hearing your explanation, the stranger responds “That’s interesting. Funny thing, I gave no thought to EA considerations when choosing my current project. I just picked it because I thought it was cool.” Then they explain their project to you, but unfortunately, the bar is too loud for you to hear what they say, so you end up just nodding along pretending to understand. Now assume you have two options: you can tell the stranger to ditch their project, or you can stay silent. For the sake of argument, let’s assume that if you tell the stranger to ditch their project, they will ditch it, but they will also get soured on EA and be unreceptive to EA messages in the future. If you stay silent, the stranger will continue their project and remain receptive to EA messages. Which option do you choose?
My answer is that, having no information about the stranger’s project, I have no particular reason to believe it will be either good or bad for the world. So I model the stranger’s project as a small random perturbation on humanity’s trajectory, of the sort that happens thousands of times per day. I see the impact of such perturbations as basically neutral on expectation. In the same way the stranger’s project could have an unexpected downside, it could also have an unexpected upside. And in the same way that the stranger’s actions could have some nasty unforeseen consequence, my action of discouraging the stranger could also have some nasty unforeseen consequence! (Nasty unforeseen consequences of my discouragement action probably won’t be as readily observable, but that doesn’t mean they won’t exist.) So I stay silent, because I gain nothing on expectation by objecting to the project, and I don’t want to pay the cost of souring the stranger on EA.
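To spell out the expected value comparison behind that choice (a rough formalization of my own, not anything precise; here $c$ is just a stand-in for the cost of souring the stranger on EA, which I’m assuming is positive):

$$E[\text{stay silent}] = E[\text{stranger's project}] \approx 0$$
$$E[\text{discourage}] \approx 0 - c < 0 \quad (c > 0)$$

On this model, staying silent is at least as good in expectation, and strictly better once the souring cost is counted.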
Suppose you agree with my argument above. If so, do you think that we should default to discouraging EAs from doing projects in the absence of further information? Why? It seems a bit counterintuitive/implausible that being part of the EA community would increase the odds that someone’s project creates a downside. If anything, it seems like being plugged into the community should increase a person’s awareness of how their project might pose a risk. (Consider the EA Hotel in comparison to an alternative of having people live cheaply as individuals. Being part of a community of EAs = more peer eyeballs on your project = more external perspectives to spot unexpected downsides.) And in the same way giving strangers default discouragement will sour them on EA, giving EAs default discouragement on doing any kind of project seems like the kind of thing that will suck the life out of the movement.
I don’t want to be misinterpreted, so to clarify:
I am in favor of people discouraging projects if, after looking at the project, they actually think the project will be harmful.
I am in favor of bringing up considerations that suggest a project might be harmful with the people engaged in it, even if you aren’t sure about the project’s overall impact.
I’m in favor of people trying to make the possibility of downsides mentally available, so folks will remember to check for them.
I’m in favor of more people doing what Joey does and offering prizes for arguments that their projects are harmful.
I’m in favor of people publicly making themselves available to shoot holes in the project ideas of others.
I’m in favor of people in the EA community trying to coordinate more effectively, engage in moral trade, cooperate in epistemic prisoner’s dilemmas, etc.
In general, I think brainstorming potential downsides has high value of information, and people should do it more. But a gamble can still have positive expected value without having purchased any information! (Also, in order to avoid bias, maybe you should try to spend an equal amount of time brainstorming unexpected upsides.)
I think it may be reasonable to focus on projects which have passed the acid test of trying to think of plausible downsides (since those projects are likely to be higher expected value).
But I don’t really see what purpose blanket discouragement serves.
The OpenPhil/OpenAI article was a good read, thanks, although I haven’t read the comments on either post or Ben’s latest thoughts, and I don’t really have an opinion either way on the value/harm of OpenPhil funding OpenAI if they did so “to buy a seat on OpenAI’s board for Open Philanthropy Project executive director Holden Karnofsky”. But of course, I wasn’t suggesting that centralised action is never harmful; I was suggesting that it’s better on average [edit: in UC-type scenarios, which I’m not sure your two examples were...man this stuff is confusing!]. It’s also ironic that part of the reason funding OpenAI might have been a bad idea seems to be that it creates more of a Unilateralist’s Curse scenario (although I did notice that the first comment claims this is not their current strategy): “OpenAI’s primary strategy is to hire top AI researchers to do cutting-edge AI capacity research and publish the results, in order to ensure widespread access.”
If we think that “individuals underestimate potential downsides relative to their estimations concerning potential upsides”, why do we expect funders to be immune to this problem?
Excellent question. No strong opinion as I’m still in anecdote territory here, but I reckon emotional attachment to one’s own grand ideas is what’s driving the underestimation of risk, and you’d expect funders to be able to assess ideas more dispassionately.
I’m not sure that EA is all that relevant to the answer I’d give in your thought experiment. If they didn’t have much power then I’d say go for it. If their project would have large consequences before anyone else could step in I’d say stop. As I said before, “I currently still think the EA Hotel has positive expected value—I don’t think it’s giving individuals enough power for the Unilateralist’s Curse to really apply.” I genuinely do expect the typical idea someone has for improving the status quo to be harmful, whether they’re an EA or a stranger in a bar. Most of the time it’s good to encourage innovation anyway, because there are feedback mechanisms/power structures in place to stop things getting out of hand if they start to really not look like good ideas. But in UC-type scenarios i.e. where those checks are not in place, we have a problem.
We might be talking past each other. Perhaps we agree that in your typical real-life scenario, i.e. where an individual does not have unilateral power, we should encourage them to pursue their altruistic ideas. Perhaps this was even what you were saying originally, and I just misinterpreted it.
[Edit: I’m pretty sure we’re talking past each other to at least some extent. I don’t think there should be “blanket discouragement”. I think the typical project that someone/an EA thinks is a good idea is in fact a bad idea, but that they should test it anyway. I do think there should be blanket discouragement of actions with large consequences that can be taken by a small minority without the endorsement of others (e.g. relating to reputational risk or information hazards).]