David Johnston comments on A challenge for AGI organizations, and a challenge for readers

David Johnston 3 Dec 2022 0:53 UTC
6 points
2 ∶ 5
I don’t think this makes sense. Your group, in the EA community, regarding AI safety, gets taken seriously whatever you write. This is not the paradigmatic example of someone who feels worried about making public mistakes. A community that gives you even more leeway to do sloppy work is not one that encourages more people to share their independent thoughts about the problem. In fact, I think the reverse is true: when your criticisms carry a lot of weight even when they’re flawed, this has a stifling effect on people in more marginal positions who disagree with you.

If you want to promote more open discussion, your time would be far better spent seeking out flawed but promising work by lesser known individuals and pointing out what you think is valuable in it.

Am I correct in my belief that you are paid to do this work? If this is so, then I think the fact that you are both highly regarded and compensated for your time means your output should meet higher standards than a typical community post. Contacting the relevant labs is a step that wouldn’t take you much time, can’t be done by the vast majority of readers, and has a decent chance of adding substantial value. I think you should have done it.
- RobBensinger 3 Dec 2022 7:16 UTC
  5 points
  0 ∶ 0
  What sort of substantial value would you expect to be added? It sounds like we either have a different belief about the value-add, or a different belief about the costs. Maybe if you sketched 2-3 scenarios that strike you as a relatively likely way for this particular post to have benefited from private conversations, I’d know better what the shape of our disagreement is.
  If your objection is less “this particular post would benefit” and more “every post that discusses an AGI org should run a draft by that org (at least if you’re doing EA work full-time)”, then I’d respond that stuff like “EAs candidly arguing about things back and forth in the comments of a post”, the 80K Podcast, and unredacted EA chat logs are extremely valuable contributions to EA discourse, and I think we should do far, far more things like that on the current margin.
  Writing full blog posts that are likewise “real” and likewise “part of a genuine public dialogue” can be valuable in much the same way; and some candid thoughts are a better fit for this format than for other formats, since some candid thoughts are more complicated, etc.
  It’s also important that intellectual progress like “long unedited chat logs” gets distilled and turned into relatively short, polished, and stable summaries; and it’s also important that people feel free to talk in private. But having some big chunks of the intellectual process be out in public is excellent for a variety of reasons. Indeed, I’d say that there’s more value overall in seeing EAs’ actual cognitive processes than in seeing EAs’ ultimate conclusions, when it comes to the domains that are most uncertain and disagreement-heavy (which include a lot of the most important domains for EAs to focus on today, in my view).
  This is not the paradigmatic example of someone who feels worried about making public mistakes. A community that gives you even more leeway to do sloppy work is not one that encourages more people to share their independent thoughts about the problem.
  I don’t think that sharing in-process snapshots of your views is “sloppy”, in the sense of representing worse epistemic standards than a not-in-process Finished Product.
  E.g., I wouldn’t say that a conversation on the 80K Podcast is more epistemically sloppy than a summary of people’s take-aways from the conversation. I think the opposite is often true, and people’s in-process conversations often reflect higher epistemic standards than their attempts to summarize and distill everything after-the-fact.
  In EA, being good at in-process, uncertain, changing, under-debate reasoning is more the thing I want to lead by example on. I think that hiding process is often setting a bad example for EAs, and making it harder for them to figure out what’s true.
  I agree that I’m not a paradigmatic example of the EAs who most need to hear this lesson; but I think non-established EAs heavily follow the example set by established EAs, so I want to set an example that’s closer to what I actually want to see more of.
  In fact, I think the reverse is true: when your criticisms carry a lot of weight even when they’re flawed, this has a stifling effect on people in more marginal positions who disagree with you.
  If my reasoning process is actually flawed, then I want other EAs to be aware of that, so they can have an accurate model of how much weight to put on my views.
  If established EAs in general have such flawed reasoning processes (or such false beliefs) that rank-and-file EAs would be outraged and give up on the EA community if they knew this fact, then we should want to outrage rank-and-file EAs, in the hope that they’ll start something else that’s new and better. EA shouldn’t pretend to be better than it is; this causes way too many dysfunctions, even given that we’re unusually good in a lot of ways.
  (But possibly we agree about all that, and the crux here is just that you think sharing rougher or more uncertain thoughts is an epistemically bad practice, and I think it’s an epistemically good practice. So you see yourself as calling for higher standards, and I see you as calling for standards that are actually lower but happen to look more respectable.)
  If you want to promote more open discussion, your time would be far better spent seeking out flawed but promising work by lesser known individuals and pointing out what you think is valuable in it.
  That seems like a great idea to me too! I’d advocate for doing this along with the things I proposed above.
  Contacting the relevant labs is a step that wouldn’t take you much time, can’t be done by the vast majority of readers
  Is that actually true? Seems maybe true, but I also wouldn’t be surprised if >50% of regular EA Forum commenters can get substantive replies pretty regularly from knowledgeable DeepMind, OpenAI, and Anthropic staff, if they try sending a few emails.
  - David Johnston 3 Dec 2022 14:04 UTC
    11 points
    2 ∶ 1
    What sort of substantial value would you expect to be added? It sounds like we either have a different belief about the value-add, or a different belief about the costs.
    I’d be very surprised if the actual amount of big-picture strategic thinking at either organisation was “very little”. I’d be less surprised if they didn’t have a consensus view about big-picture strategy, or a clearly written document spelling it out. If I’m right, I think the current content is misleading-ish. If I’m wrong and little thinking has actually been done, there’s some chance they say “we’re focused on identifying and tackling near-term problems”, which would be interesting to me given what I currently believe. If I’m wrong and something clear has been written, then making this visible (or pointing out its existence) would also be a useful update for me.
    Polished vs sloppy
    Here are some dimensions I think of as distinguishing sloppy from polished:
    Vague hunches <-> precise theories
    First impressions <-> thorough search for evidence/prior work
    Hard <-> easy to understand
    Vulgar <-> polite
    Unclear <-> clear account of robustness, pitfalls and so forth
    All else equal, I don’t think the left side is epistemically superior. It can be faster, and that might be worth it, but there are obvious epistemic costs to relying on vague hunches, first impressions, failures of communication and overlooked pitfalls (politeness is perhaps neutral here). I think these costs are particularly high in, as you say, domains that are uncertain and disagreement-heavy.
    I think it is sloppy to stay too close to the left if you think the issue is important and you have time to address it properly. You have to manage your time, but I don’t think there are additional reasons to promote sloppy work.
    You say that there are epistemic advantages to exposing thought processes, and you give the example of dialogues. I agree there are pedagogical advantages to exposing thought processes, but exposing thoughts clearly also requires polish, and I don’t think pedagogy is a high priority most of the time. I’d be way more excited to see more theory from MIRI than more dialogues.
    If my reasoning process is actually flawed, then I want other EAs to be aware of that, so they can have an accurate model of how much weight to put on my views.
    I don’t think it’s realistic to expect Lightcone forums to do serious reviews of difficult work. That takes a lot of individual time and dedication; maybe you occasionally get lucky, but you should mostly expect not to.
    I agree that I’m not a paradigmatic example of the EAs who most need to hear this lesson [of exposing the thought process]; but I think non-established EAs heavily follow the example set by established EAs, so I want to set an example that’s closer to what I actually want to see more of
    Maybe I’ll get into this more deeply one day, but I just don’t think sharing your thoughts freely is a particularly effective way to encourage other people to share theirs. I think you’ve been pretty successful at getting the “don’t worry about being polite to OpenAI” message across, less so the higher level stuff.
    - RobBensinger 3 Dec 2022 21:08 UTC
      4 points
      0 ∶ 0
      I agree with a lot of what you say! I still want to move EA in the direction of “people just say what’s on their mind on the EA Forum, without trying to dot every i and cross every t; and then others say what’s on their mind in response; and we have an actual back-and-forth that isn’t carefully choreographed or extremely polished, but is more like a real conversation between peers at an academic conference”.
      (Another way to achieve many of the same goals is to encourage more EAs who disagree with each other to regularly talk to each other in private, where candor is easier. But this scales a lot more poorly, so it would be nice if some real conversation were happening in public.)
      A lot of my micro-decisions in making posts like this are connected to my model of “what kind of culture and norms are likely to result in EA solving the alignment problem (or making a lot of progress)?”, since I think that’s the likeliest way that EA could make a big positive difference for the future. In that context, I think building conversations about heavily polished, “final” (rather than in-process) cognition tends to be insufficient for fast and reliable intellectual progress:
      Highly polished content tends to obscure the real reasons and causes behind people’s views, in favor of reasons that are more legible, respectable, impressive, etc. (See Beware defensibility.)
      AGI alignment is a pre-paradigmatic proto-field where making good decisions will probably depend heavily on people having good technical intuitions, intuiting patterns before they know how to verbalize those patterns, and generally becoming adept at noticing what their gut says about a topic and putting their gut in contact with useful feedback loops so it can update and learn.
      In that context, I’m pretty worried about an EA where everyone is hyper-cautious about saying anything that sounds subjective, “feelings-ish”, hard-to-immediately-transmit-to-others, etc. That might work if EA’s path to improving the world is via donating more money to AMF or developing better vaccine tech, but it doesn’t fly if making (and fostering) conceptual progress on AI alignment is the path to impact.
      Ideally, it shouldn’t merely be the case that EA technically allows people to candidly blurt out their imperfect, in-process thoughts about things. Rather, EA as a whole should be organized around making this the expected and default culture (at least to the degree that EAs agree with me about AI being a top priority), and this should be reflected in a thousand small ways in how we structure our conversation. Normal EA Forum conversations should look more like casual exchanges between peers at an academic conference, and less like polished academic papers (because polished academic papers are too inefficient a vehicle for making early-stage conceptual progress).
      I think this is not only true for making direct AGI alignment progress, but is also true for converging on key macrostrategy questions (hard vs. soft takeoff; overall difficulty of alignment; probability of a sharp left turn; impressiveness of GPT-3; etc.). Insofar as we haven’t already converged a lot on these questions, I think a major bottleneck is that we’ve tried too much to make our reasoning sound academic-paper-ish before it’s really in that format, with the result that we confuse ourselves about our real cruxes, and people end up updating a lot less than they would in a normal back-and-forth.
      Highly polished, heavily privately reviewed and edited content tends to reflect the beliefs of larger groups, rather than the beliefs of a specific individual.
      This often results in deference cascades, double-counting evidence, and herding: everyone is trying (to some degree) to bend their statements in the direction of what everyone else thinks. I think it also often creates “phantom updates” in EA, where there’s a common belief that X is widely believed, but the belief is wrong to some degree (at least until everyone updates their outside views because they think other EAs believe X).
      It also has various directly distortionary effects (e.g., a belief might seem straightforwardly true to all the individuals at an org, but doesn’t feel like “the kind of thing” an organization writ large should endorse).
      In principle, it’s not impossible to push EA in those directions while also passing drafts a lot more in private. But I hope it’s clearer why that doesn’t seem like the top priority to me (and why it could be at least somewhat counter-productive) given that I’m working with this picture of our situation.
      I’m happy to heavily signal-boost replies from DM and Anthropic staff (including editing the OP), especially if they show that MIRI was just flatly wrong about how much those orgs already have a plan. And I endorse people docking MIRI points insofar as we predicted wrongly here; and I’d prefer the world where people knew our first-order impressions of where the field’s at in this case, and were able to dock us some points if we turn out to be wrong, as opposed to the world where everything happens in private.
      (I think I still haven’t communicated fully why I disagree here, but hopefully the pieces I have been able to articulate are useful on their own.)
- lauren 3 Dec 2022 19:47 UTC
  3 points
  0 ∶ 0
  this approach to reasoning assumes authorities are valid. do not trust organizations this way. It is one of effective altruism’s key failings. how can we increase pro-social distrust in effective altruism so that authorities are not trusted?
- pseudonym 3 Dec 2022 4:38 UTC
  2 points
  1 ∶ 0
  I would be curious to hear the pushbacks from people who disagree-voted this!