I have similar views to Marius’s comment. I did AISC in 2021 and I think it was somewhat useful for starting in AI safety, although I think my views and understanding of the problems were pretty dumb in hindsight.
AISC does seem extremely cheap (at least for the budget options). Even if you put, say, 80% on the “Only top talent matters” model (MATS, Astra, others) and 20% on the “Cast a wider net” model (AISC), I would still guess that AISC is a good thing to do.
My main worries here are about negative effects. These mostly concern the “To not build uncontrollable AI” stream; 3 out of 4 of these projects seem to be about communication/politics/advocacy.[1] I’m worried about these having negative effects, making AI safety people seem crazy, uninformed, or careless. I’m mainly worried because Remmelt’s recent posting on LW really doesn’t seem like careful or well-thought-through communication. (In general I think people should be free to do advocacy etc., although please think of the externalities.) Part of my worry is also that AISC is a place for new people to come, and new people might not know how fringe these views are in the AI safety community.
I would be more comfortable with these projects (and they would potentially still be useful!) if they were more focused on understanding the things they are advocating for. E.g. a report on “How could lawyers and coders stop AI companies using their data?”, rather than an attempt to start an underground coalition.
All the projects in the “Everything else” streams (run by Linda) seem good or fine, and likely a decent way to get involved and start thinking about AI safety. Although, as always, there is a risk of wasting time with projects that end up being useless.
[ETA: I do think that AISC is likely good on net.]

[1] The other one seems like a fine/non-risky project related to domain whitelisting.
I’m worried about these having negative effects, making the AI safety people seem crazy, uninformed, or careless.
If you look at the projects, you’ll notice that each is carefully scoped.
The ODD project is an engineering project for specifying the domain that a model should be designed for and used in.
The Luddite Pro project is about journalism on current misuses of generative AI.
The Lawyers project is about supporting creative professionals in litigating based on existing law (DMCA takedowns, item-level disclosures under the EU AI Act, pre-litigation research for an EU lawsuit).
The CMC project is about assessing (not carrying out) possible congressional messaging campaigns on the harms / non-safety of AI.
The fourth project was on the edge for me. I had a few calls with the research lead and decided it was okay to go ahead if they managed to recruit applicants with expertise in policy communication (which they did!).
I prefer carefully scoped projects in this area, including for the concern you raised.
I’m mainly worried about this because Remmelt’s recent posting on LW really doesn’t seem like careful or well thought through communication.
Do you mean the posts early last year about fundamental controllability limits? That’s totally fair – I did not do a good job of taking people’s perspectives into account when sharing new writings.
My mistake in part was presuming that since I’m in the same community, I could have more of an open conversation about it. I was hoping to put out a bunch of interesting posts before putting out more rigorous explainers of the argumentation. Looking back, I should have spent way more time vetting and refining every (link)post. People’s attention is limited, and you want to explain things well from their perspective right off the bat.
Later that year, I distilled the reasoning into a summary explanation. That got 47 upvotes on LW.
Do you mean the posts early last year about fundamental controllability limits?
Yep, that is what I was referring to. It does seem like you’re likely to be more careful in the future, but I’m still fairly worried about advocacy done poorly. (Although, like, I also think people should be able to do advocacy if they want.)
It does seem like you’re likely to be more careful in the future
Good that you raised this concern. Yes, I am more selective now in what I put out on the forums.
In part, because I am having more one-on-one calls with (established) researchers. I find there is much more space to clarify and paraphrase that way.
On the forums, certain write-ups seem to draw dismissive comments. Some combination of:
(a) it is not written by a friend or big-name researcher;
(b) it requires some new counterintuitive reasoning steps;
(c) it leads to some unfavoured conclusion.
With any two of those, writing something that lands is hard but still doable:
a big name writes up counterintuitive reasoning toward an unfavoured conclusion;
an unfamiliar person writes up counterintuitive reasoning toward a favoured conclusion;
an unfamiliar person writes up obvious reasoning toward an unfavoured conclusion.
In my case, for most readers it looks like:
an unfamiliar person writes up counterintuitive reasoning toward an unfavoured conclusion – all three at once.
There are just so many ways that can go wrong, and the ways I tried to pre-empt it failed. I.e.:
posting a sequence with familiar concepts to make the outside researcher more known to the community;
cautioning against jumping to judgements;
clarifying why alternatives to alignment make sense.
Looking back: I should have just held off until I managed to write one explainer (this one) that folks in my circles did not find extremely unintuitive.