I’m worried about these having negative effects, making the AI safety people seem crazy, uninformed, or careless.
If you look at the projects, notice that each is carefully scoped.
The ODD project is an engineering project for specifying the domain that a model should be designed for and used in.
The Luddite Pro project is about journalism on current misuses of generative AI.
The Lawyers project is about supporting creative professionals in litigating based on existing law (DMCA takedowns, item-level disclosures under the EU AI Act, pre-litigation research for an EU lawsuit).
The CMC project is about assessing (not carrying out) possible congressional messaging campaigns on the harms and non-safety of AI.
The fourth project was on the edge for me. I had a few calls with the research lead and decided it was okay to go ahead if they managed to recruit applicants with expertise in policy communication (which they did!).
I prefer carefully scoped projects in this area, in part because of the concern you raised.
I’m mainly worried about this because Remmelt’s recent posting on LW really doesn’t seem like careful or well thought through communication.
Do you mean the posts early last year about fundamental controllability limits? That's totally fair – I did not do a good job of taking people's perspectives into account when sharing new writings.
My mistake, in part, was presuming that since I'm in the same community, I could have more of an open conversation about it. I was hoping to put out a bunch of interesting posts before putting out more rigorous explainers of the argumentation. Looking back, I should have spent way more time vetting and refining every (link)post. People's attention is limited, and you want to explain it well from their perspective right off the bat.
Later that year, I distilled the reasoning into a summary explanation. That got 47 upvotes on LW.
Do you mean the posts early last year about fundamental controllability limits?
Yep, that is what I was referring to. It does seem like you're likely to be more careful in the future, but I'm still fairly worried about advocacy done poorly. (Although, like, I also think people should be able to advocate if they want.)
It does seem like you’re likely to be more careful in the future
Good that you raised this concern. Yes, I am more selective now in what I put out on the forums.
In part, that is because I am having more one-on-one calls with (established) researchers. I find there is much more space to clarify and paraphrase that way.
On the forums, certain write-ups seem to draw dismissive comments, typically when some combination of the following holds:
(a) the write-up is not by a friend or a big-name researcher;
(b) it requires some new, counterintuitive reasoning steps;
(c) it leads to some unfavoured conclusion.
With any two of those, the writing is hard but still doable. For example:
a big name writes up counterintuitive reasoning toward an unfavoured conclusion;
an unfamiliar person writes up counterintuitive reasoning toward a favoured conclusion;
an unfamiliar person writes up obvious reasoning toward an unfavoured conclusion.
In my case, for most readers it looks like all three at once:
an unfamiliar person writes up counterintuitive reasoning toward an unfavoured conclusion.
There are just so many ways that can go wrong. The ways I tried to pre-empt it failed:
posting a sequence with familiar concepts to make the outside researcher more known to the community;
cautioning against jumping to judgements;
clarifying why alternatives to alignment make sense.
Looking back, I should have just held off until I had managed to write one explainer (this one) that folks in my circles did not find extremely unintuitive.