This comes late, but I appreciate this post and am curating it. I think the core message is an important one, some sections can help people develop intuitions for what the problems are, and the post is written in an accessible way (which is often not the case for AI safety-related posts). As others noted, the post also made a bunch of specific claims that others can disagree with as opposed to saying vague things or hedging a lot, which I also appreciate (see also epistemic legibility).
I share Charlie Guthmann’s question here: I get the sense that some work is in a fuzzy grey area between alignment and capabilities, so comparing the amount of work being done on safety vs. capabilities is difficult. I should also note that I don’t think all capabilities work can be defended as safety-relevant (see also my own post on safety-washing).
...
Quick note: I know Leopold — I don’t think this influenced my decision to curate the post, but FYI.
In my view this is a bad decision.

As I wrote on LessWrong:

Sorry, but my rough impression from the post is that you seem to be at least as confused about where the difficulties are as the average alignment researcher you think is not on the ball, and the style of somewhat strawmanning everyone plus strong words is a bit irritating.

In particular, I don't appreciate the epistemics of these moves together:
1. Appealing to seeing things from close proximity ("Then I got to see things more up close. And here's the thing: nobody's actually on the friggin' ball on this one!").
2. Strawmanning and weakmanning what almost everyone else thinks and is doing.
3. Using emotionally compelling words like "real science" for vaguely defined subjects where the content may be the opposite of what people imagine. Is the empirical, alchemy-style type of ML research what's being advocated for as the real science?
4. Sounding overall as if the aim is to persuade rather than to explain.

I think curating this signals that this type of bad epistemics is fine, as long as you strawman and misrepresent others in a legible way and your writing is persuasive. Also, there is no need to actually engage with existing arguments; you can just claim to see things more up close.

Also, to what extent are moderator decisions influenced by status and centrality in the community? If someone new and non-central to the community came up with this brilliant set of ideas for how to solve AI safety:
1. Everyone working on it is not on the ball. Why? They are all working on the wrong things!
2. The promising approach is to do something very close to how empirical ML capabilities research works.
3. This is the type of problem where you can just throw money at it and attract better ML talent.
… I doubt it would have a high chance of being curated.
Anecdata: thanks for curating, I didn’t read this when it first came through and now that I did, it really impacted me.
Edit:
Coming back after reading it on LessWrong, I'm very confused again; it seems to have been much less well received there. What someone here calls a "great balance of technical and generally legible content" might over there be considered "strawmanning and frustrating," and I really don't know what to think.
As others noted, the post also made a bunch of specific claims that others can disagree with as opposed to saying vague things or hedging a lot, which I also appreciate (see also epistemic legibility).
Thank you for acknowledging this and emphasizing the specific claims being made. I'm guessing you didn't mean to cast aspersions through a euphemism; I'd respect your choice not to be as explicit about it, if that is part of what you meant here.
For my part, though, I think you're understating how much of a problem those other posts are, so I feel obliged to emphasize how the vagueness and hedging in some of them has, wittingly or not, served to spread hazardous misinformation. To be specific, here's an excerpt from another comment I made raising the same concern:
Others who've tried to get across the same point [Leopold is] making have, instead of explaining their disagreements, generally alleged that almost everyone else in the entire field of AI alignment is literally insane. [...] It amounts to someone making a bold, senseless attempt to, arguably, dehumanize hundreds of their peers.
This isn't just a negligible error from somebody recognized as part of a hyperbolic fringe of the AI safety/alignment community. It's direly counterproductive when it comes from leading rationalists, like Eliezer Yudkowsky and Oliver Habryka, who wield great influence in their own right and are taken very seriously by hundreds of other people.