Ideas for AI labs: Reading list

Related: AI policy ideas: Reading list.

This document is about ideas for AI labs, mostly from an x-risk perspective. Its organization treats technical AI work, including technical AI safety, as a black box.

Lists & discussion

Levers

Desiderata

Maybe I should make a separate post on desiderata for labs (for existential safety).

Ideas

Coordination[1]

See generally The Role of Cooperation in Responsible AI Development (Askell et al. 2019).

Transparency

Transparency enables coordination (and some regulation).

Publication practices

Labs should minimize or delay the diffusion of their capabilities research.

Structured access to AI models

Governance structure

Miscellanea

See also


Some sources are roughly sorted within sections by a combination of x-risk relevance, quality, and influence, but sometimes I didn’t bother to sort them, and I haven’t read all of them.

Please have a low bar to suggest additions, substitutions, rearrangements, etc.

Current as of: 9 July 2023.

  1. ^

    At various levels of abstraction, coordination can look like:
    - Avoiding a race to the bottom
    - Internalizing some externalities
    - Sharing some benefits and risks
    - Differentially advancing more prosocial actors?
    - More?

  2. ^

    Policymaking in the Pause (FLI 2023) cites A Systematic Review on Model Watermarking for Neural Networks (Boenisch 2021); I don’t know if that source is good. (Note: this disclaimer does not imply that I know that the other sources in this doc are good!)

    I am not excited about watermarking. (Note: this disclaimer does not imply that I am excited about the other ideas in this doc! But I am excited about most of them.)

Crossposted from LessWrong.