I think OpenPhil’s grant to OpenAI is quite likely the best grant that OpenPhil has made in terms of counterfactual positive impact.
It’s worth noting that OpenPhil’s grant to OpenAI was made in order to acquire a board seat and generally to establish a relationship, rather than because adding more money to OpenAI was a good use of funds at the margin.
See the grant write-up here, which discusses the motivation for the grant in detail.
Generally, I think influencing OpenAI was made notably easier by this grant (due to the board seat), and this influence seems quite good and has led to various good consequences (an increased emphasis on AGI alignment, for example).
The cost in dollars was quite low.
The main downside I can imagine is that this grant served as an implicit endorsement of OpenAI, which resulted in a bunch of EAs working there, which was then net-negative. My guess is that having these EAs work at OpenAI was probably good on net (due to a combination of acquiring influence and safety work; I don’t currently think the capabilities work was good on its own).
Thanks Ryan. Obviously this is a topic where there will be a wide range of opinions. I would be interested to hear what Holden Karnofsky thinks of this grant five years later. He may already have written about it; I would appreciate it if someone could point me to it if he has.
My big initial concern with both the grant proposal and your comment here is that neither of you mentions perhaps the most important potential negative: that the grant could have played a role in accelerating the march towards dangerous AI. Instead you mention the EA “implicit endorsement” as being more important.
Even if we assume for the moment that the effect of the grant was net positive in increasing the safety of OpenAI itself, what if it accelerated their progress just a little and helped create this dangerous race we are in? When the head of Microsoft says “the race is on”, basically referring to ChatGPT, then if this grant made even a 0.001 percent contribution to speeding up that race, which seems plausible, the grant could still be strongly net negative.
I don’t have a problem with your positive opinion (although I strongly disagree), but I think it is good to engage with the stronger counterpoints rather than with what I think is a bit of a strawman, the “implicit endorsement” negative.
Suppose the grant made the race 0.001% faster overall, but made OpenAI 5% more focused on alignment. That seems like an amazingly good trade to me.
This is quite sensitive to the exact quantitative details, and I think the speed-up is likely way, way more than 0.001%.
I really like this line of argument, nice one.
I’m not sure what trade-off I would take; it depends on how much difference a “focus on alignment” within a capabilities-focused org is likely to make to improving safety. On this I would defer to people who are far more enmeshed in this ecosystem.
Instinctively I would put maybe a 0.1 percent speed-up as a bigger net harm than a 5 percent focus on safety would be a net good, but I am so uncertain that I could even reverse that with a decent counterargument, lol.
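To make the sensitivity point above concrete, here is a minimal toy expected-value sketch of the trade-off being discussed. Every name and number in it (`harm_per_speedup_point`, `benefit_per_focus_point`, the 1000:1 exchange rate) is hypothetical and chosen purely for illustration; the only point is that the conclusion flips depending on the assumed exchange rate between race speed-up and alignment focus.

```python
# Toy model of the trade-off discussed above. All parameters are hypothetical;
# the only point is that the sign of the answer flips with the assumed
# exchange rate between "race speed-up" harm and "alignment focus" benefit.

def net_effect(speedup_pct: float, focus_pct: float,
               harm_per_speedup_point: float, benefit_per_focus_point: float) -> float:
    """Positive = net good, negative = net harm (arbitrary units)."""
    return focus_pct * benefit_per_focus_point - speedup_pct * harm_per_speedup_point

# If one point of speed-up is assumed to be 1000x as harmful as one point of
# alignment focus is helpful, a 0.001% speed-up for 5% more focus is a good trade:
print(net_effect(0.001, 5, harm_per_speedup_point=1000, benefit_per_focus_point=1))  # 4.0

# ...but a 0.1% speed-up at the same exchange rate is already net negative:
print(net_effect(0.1, 5, harm_per_speedup_point=1000, benefit_per_focus_point=1))    # -95.0
```

Under that (made-up) exchange rate, the 0.001% case looks clearly positive while the 0.1% case looks clearly negative, which is why the disagreement here comes down to the exact quantitative details.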
Oh, I just think the effect of the 30 million dollars is way smaller than the total value of the labor from EAs working at OpenAI, such that the effect of the money is dominated by EAs being more likely to work there. I’m not confident in this, but the money seems pretty unimportant ex post, while the labor seems quite important.
I think the speed-up in timelines from people with EA/longtermist motivations working at OpenAI is more like 6 months to 3 years (I tend to think this speed-up is bigger than other people I talk to do). The speed-up from the money seems relatively tiny.
Edit: It’s worth noting that this grant is not counterfactually responsible for most of these people working at (or continuing to work at) OpenAI, but I do think that the human capital is likely a more important consideration here than the literal financial capital, because the total magnitude of the human capital is bigger.
That’s an interesting argument, thanks. I wouldn’t have thought instinctively that the counterfactual of having EA people rather than others working at OpenAI would speed up timelines to that extent.
If that kind of speed-up is close to accurate, I would estimate the grant as even more net negative.
Well, OpenAI just announced that they’re going to spend 20% of their compute on alignment over the next four years, so I think it’s paid off prima facie.
I think the 20% figure, albeit a step in the right direction, is in reality a lot less impressive than it first sounds. From OpenAI’s “Introducing Superalignment” post:
We are dedicating 20% of the compute we’ve secured to date over the next four years to solving the problem of superintelligence alignment.
I expect that 20% of OpenAI’s 2023 compute will be but a tiny fraction of their 2027 compute, given that training compute has been growing by something like 4.2x/year.
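A minimal back-of-the-envelope sketch of that claim, assuming the ~4.2x/year growth figure above and treating “compute secured to date” as roughly OpenAI’s 2023 compute (an assumption; the post does not say exactly what has been secured):

```python
# Back-of-the-envelope: how 20% of 2023 compute compares to projected 2027 compute,
# assuming training compute keeps growing at roughly 4.2x per year (figure cited above).
growth_per_year = 4.2
years = 4  # 2023 -> 2027

compute_2023 = 1.0                                       # normalize 2023 compute to 1 unit
compute_2027 = compute_2023 * growth_per_year ** years   # ~311 units

pledged = 0.20 * compute_2023                            # "20% of the compute secured to date"

print(f"2027 compute relative to 2023: {compute_2027:.0f}x")               # ~311x
print(f"Pledge as a share of 2027 compute: {pledged / compute_2027:.2%}")  # ~0.06%
```

On those assumptions the pledge works out to well under 0.1% of projected 2027 compute, which is the sense in which the 20% figure is less impressive than it first sounds.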
Dwarkesh Patel recently asked Holden about this:
Dwarkesh Patel
Are you talking about OpenAI? Yeah. Many people on Twitter might have asked if you were investing in OpenAI.
Holden Karnofsky
I mean, you can look up our $30 million grant to OpenAI. I think it was back in 2016–– we wrote about some of the thinking behind it. Part of that grant was getting a board seat for Open Philanthropy for a few years so that we could help with their governance at a crucial early time in their development. I think some people believe that OpenAI has been net negative for the world because of the fact that they have contributed a lot to AI advancing and to AI being sort of hyped, and they think that gives us less time to prepare for it. However, I do think that all else being equal, AI advancing faster gives us less time to prepare. It is a bad thing, but I don’t think it’s the only consideration. I think OpenAI has done a number of good things too, and has set some important precedents. I think it’s probably much more interested in a lot of the issues I’m talking about and risks from advanced AI than like the company that I would guess would exist if they didn’t, would be doing similar things.
I don’t really accept that the idea that OpenAI is a negative force. I think it’s highly debatable. We could talk about it all day. If you look at our specific grant, it’s even a completely different thing because a lot of that was not just about boosting them, but about getting to be part of their early decision making. I think that was something that there were benefits and was important. My overall view is that I don’t look back on that grant as one of the better grants we’ve made, not one of the worse ones. But certainly we’ve done a lot of things that have had, you know, that have not worked out. I think there are some times shortly when we’ve done things that have consequences we didn’t intend. No philanthropist can be free of that. What we can try and do is be responsible, seriously do our homework to try to understand things beforehand, see the risks that we’re able to see, and think about how to minimize them.
Wow, thanks so much.
Basically he seems very, very uncertain about whether it is positive or negative.
Very interesting.
Do you think it ended up having a net positive impact so far?
Based.
We’ll see. I am currently feeling a sense of doom around how this decision will play out. There is a substantial probability that this will have been a really major blunder and caused an enormous amount of harm (and also a non-negligible probability that it will be one of the best decisions of all time).
Well, mark my words.
For clarity, does ‘This decision’ mean the original grant for the board seat, or that historical board seat being used to oust Altman and Brockman from the company?
Yeah, that’s reasonable; as of 5:36pm PST, November 18, 2023, it still seems like a good bet.
I definitely am worried about either Sam Altman + Greg Brockman starting a new, less safety-focused lab, or Sam+Greg somehow returning to OpenAI and removing the safety-focused people from the board.
Even with this, it seems pretty good to have safety-focused people with some influence over OpenAI. I’m a bit confused about situations where it’s like “Yes, it was good to get influence, but it turned out you made a bad tactical mistake and ended up making things worse.”