I think more people in a worldwide population generally leads to more innovation, but primarily in domains with large returns to scale, where there are strong incentives for people to make progress. If you want to get people to explore a specific problem, I think more people rarely helps (because the difficulty lies in aiming people at the problem, not in the firepower you have).
I think adding more people also rarely causes more exploration to happen. Large companies are usually much less innovative than small companies. Coordinating large groups of people usually requires conformity, and because it is much easier to harm a system than to produce value for it, it also requires widespread conservatism in order to function. I think similar things are happening in EA: the larger EA gets, the more people are concerned about someone “destroying the reputation of the community”, and the more people have to push on the brakes in order to prevent anyone from taking risky action.
I think there exist potential configurations of a research field that can scale substantially better, but I don’t think we are currently configured that way, and I expect by default exploration to go down as scale goes up (in general, the number of promising new research agendas and directions seems to me to have gone down a lot during the last 5 years as EA has grown a lot, and this is a sentiment I’ve heard mirrored from most people who have been engaged for that long).
in general, the number of promising new research agendas and directions seems to me to have gone down a lot during the last 5 years
At least in technical AI Alignment, the opposite seems to have happened in the last couple of years. It looks like we’re in the midst of a Cambrian explosion of research groups and agendas. Or would you argue that most of these aren’t promising?
There is a Cambrian explosion of research groups, but basically no new agendas as far as I can tell? Of the agendas listed on that post, I think basically all are 5+ years old (some have morphed; ELK, for example, is a different take on scalable oversight than Paul had 5 years ago, but I would classify it as the same agenda).
There is a giant pile of people working on the stuff, though the vast majority of new work can be characterized as “let’s just try to solve some near-term alignment problems and hope that it somehow informs our models of long-term alignment problems”, plus a large pile of different types of transparency research. I think there are good cases for that work, though I am not very optimistic about it helping with existential risk.
That’s really interesting and unexpected! Seems worth figuring out why. What are your top hypotheses for why that’s happening?
My first guess would be epistemic humility norms.
My second would be that the first people in a field are often disproportionately talented compared to people coming in later. (Although you could also tell a story about how, at the beginning, the field is too socially weird to attract a lot of top talent.)
My third is that since alignment is so hard, it’s easier for people to latch onto existing research agendas instead of creating new ones. At the beginning there were practically no agendas to latch onto, so people had to make new ones, but now there are a few, so most people just sort themselves into those.

Are there any promising directions for AGI x-risk reduction that you are aware of that aren’t being (significantly) explored?
Large companies are usually much less innovative than small companies
I think this is still in the framework of thinking that large groups of people having to coordinate leads to stagnation. To change my mind, you’d have to make the case that having a larger number of startups leads to less innovation, which seems like a hard case to make.
the larger EA gets, the more people are concerned about someone “destroying the reputation of the community”
I think this is a separate issue that might be caused by the size of the movement, but a different hypothesis is that it’s simply an idea that has traction in the movement, one which has been around for a long time, even when we were a lot smaller. Considerations like spending your “weirdness points” have been around since the very beginning.
(On a side note, I think we’re overly concerned about this, but that’s a whole other post. Suffice it to say that a lot of the probability mass is on this not being caused by the size of the movement, but rather by a particularly sticky idea.)
I think there exist potential configurations of a research field that can scale substantially better, but I don’t think we are currently configured that way
🎯 I 100% agree. I’m thinking of spending some more time thinking about, and writing up, ways we could make it so the movement could usefully take on more researchers. I also encourage others to think on this, because it could unlock a lot of potential.
I expect by default exploration to go down as scale goes up
I think this is where we disagree. It’d be very surprising if ~150 researchers were the optimal number, or if having fewer would lead to more innovation and more/better research agendas.
in general, the number of promising new research agendas and directions seems to me to have gone down a lot during the last 5 years as EA has grown a lot, and this is a sentiment I’ve heard mirrored from most people who have been engaged for that long
An alternative hypothesis is that the people you’ve been talking to have become more pessimistic about there being hope at all (if you hang out with MIRI folk a lot, I’d expect this to be more acute). It might not be that more people are having bad ideas, or that having more people in the movement leads to a decline in quality, but rather that a certain contingent thinks alignment is impossible or deeply improbable, so that all ideas seem bad. From this point of view, the default is that all new research agendas seem bad. It’s not that the agendas got worse; it’s that people think the problem is even harder than they originally thought.
Another hypothesis is that the idea of epistemic humility has been spreading, combined with the idea that you need intensive mentorship. This leads to new people coming in being less likely to actually come up with new research agendas, but rather to defer to authority. (A whole other post there!)
Anyways, just some alternatives to consider :) It’s hard to convey tone over text, but I’m enjoying this discussion a lot and you should read all my writing assuming a lot of warmth and engagement. :)
I think this is still in the framework of thinking that large groups of people having to coordinate leads to stagnation. To change my mind, you’d have to make the case that having a larger number of startups leads to less innovation, which seems like a hard case to make.
I think de facto right now people have to coordinate in order to do work on AI Alignment, because most people need structure and mentorship and guidance to do any work, and want to be part of a coherent community.
Separately, I also think many startup communities are indeed failing to be innovative because of their size and culture. Silicon Valley is a pretty unique phenomenon, and I’ve observed “startup communities” in Germany that felt to me like they harmed innovation more than they benefited it. The same is true for almost any “startup incubator” that large universities are trying to start. When I visit them, I feel like the culture there primarily encourages conformity and chasing the same proxy metrics as everyone else.
I think actually creating a startup ecosystem is hard, though still easier than creating a similar ecosystem for something as ill-defined as AI Alignment. The benefit that startups have is that you can very roughly measure success by money, at least in the long run, and this makes it pretty easy to point many people at the problem (and creates strong incentives for people to point themselves at the problem).
I think we have no similarly short pointer for AI Alignment. Most people who start working in the field seem to me to be quite confused about what the actual problem to be solved is, and often just end up doing AI capabilities research while slapping an “AI Alignment” label on it. I think scaling that up mostly just harms the world.
I think we should generally have a prior that social dynamics of large groups of people end up pushing heavily towards conformity, and that those pressures towards conformity can cancel out many orders of magnitude of growth of the number of people who could theoretically explore different directions.
As a concrete case study, I like this Robin Hanson post, “The World Forager Elite”:

The world has mostly copied bad US approaches to over-regulating planes as well. We also see regulatory convergence in topics like human cloning; many had speculated that China would defy the consensus elsewhere against it, but that turned out not to be true. Public prediction markets on interesting topics seem to be blocked by regulations almost everywhere, and insider trading laws are most everywhere an obstacle to internal corporate markets.
Back in February we saw a dramatic example of world regulatory coordination. Around the world public health authorities were talking about treating this virus like they had treated all the others in the last few decades. But then world elites talked a lot, and suddenly they all agreed that this virus must be treated differently, such as with lockdowns and masks. Most public health authorities quickly caved, and then most of the world adopted the same policies. Contrarian alternatives like variolation, challenge trials, and cheap fast lower-reliability tests have also been rejected everywhere; small experiments have not even been allowed.
One possible explanation for all this convergence is that regulators are just following what is obviously the best policy. But if you dig into the details you will quickly see that the usual policies are not at all obviously right. Often, they seem obviously wrong. And having all the regulatory bodies suddenly change at once, even when no new strong evidence appeared, seems especially telling.
It seems to me that we instead have a strong world culture of regulators, driven by a stronger world culture of elites. Elites all over the world talk, and then form a consensus, and then authorities everywhere are pressured into following that consensus. Regulators most everywhere are quite reluctant to deviate from what most other regulators are doing; they’ll be blamed far more for failures if they deviate. If elites talk some more, and change their consensus, then authorities must then change their policies. On topic X, the usual experts on X are part of that conversation, but often elites overrule them, or choose contrarians from among them, and insist on something other than what most X experts recommend.
The number of nations, communities, and researchers capable of doing innovative things in response to COVID was vastly greater in 2020 than for any previous pandemic. But what we saw was much less global variance and innovation in pandemic responses. I think there was scientific innovation, and that innovation was likely greater than for previous pandemics, but overall, despite the vastly greater number of nations and people in the international community of 2020, this only produced more risk-aversion about stepping out of line with elite consensus.
I think by default we should expect similar effects in fields like AI Alignment. I think maintaining a field that is open to new ideas and approaches is actively difficult. If you grow the field without trying to preserve the concrete and specific mechanisms that are in place to allow innovation, more people will not result in more innovation; it will result in less, even from the people who have previously been part of the same community.
In the case of COVID, the global research community spent a substantial fraction of its effort actively preventing people from performing experiments like variolation or challenge trials, and we see the same in fields like psychology, where a substantial fraction of energy is spent on ever-increasing ethical review requirements.
We see the same in the construction industry (a recent strong interest of mine), which despite its quickly growing size is performing substantially fewer experiments than it was 40 years ago, and is spending most of its effort actively regulating what other people in the industry can do, limiting the types of allowable construction materials and approaches to smaller and smaller sets.
By default, I expect fast growth of the AI Alignment community to reduce innovation for the same reasons. I expect a larger community will increase pressures towards forming an elite consensus, and that consensus will be enforced via various legible and illegible means. Most of the world is really not great at innovation, and the default outcome of large groups of people, even when pointed towards a shared goal, is not innovation but conformity. If we grow recklessly, I think we will default towards the same common outcome.
Re conformity, I wonder if related arguments could help shift the Future Fund’s worldview?

This is a good point, and a wake-up call for us to do better with AI Alignment. Given that the majority of funding for AGI x-safety is coming from within EA right now, and that as a community we are acutely aware of the failings of the COVID response, we should be striving to do better.

Back in February we saw a dramatic example of world regulatory coordination. Around the world public health authorities were talking about treating this virus like they had treated all the others in the last few decades. But then world elites talked a lot, and suddenly they all agreed that this virus must be treated differently, such as with lockdowns and masks. Most public health authorities quickly caved, and then most of the world adopted the same policies.

Is there any legible evidence for this?
There was some deviation (e.g. no lockdowns in Sweden), but the most telling thing was that there were no human challenge trials anywhere in the world. That alone was a tragedy that prolonged the pandemic by months (by delaying the roll-out of vaccines) and caused millions of deaths.