I’m curating this post. (See also the comments on LessWrong.)
Some sections that stood out to me (turns out it’s lots of sections!):
Restraint is not radical (and subsections, especially “Extremely valuable technologies”) — great section.
Restraint is not terrorism, usually — great section. I really appreciated the list of types of work that could be part of slowing down AI and think more people should read the list.
The complicated race/anti-race — the example that started, “For example, here’s a situation which I think sounds intuitively like a you-should-race world, but where in the first model above, you should actually go as slowly as possible” was really useful (and the spreadsheet is great).
Caution is cooperative. This was an interesting argument that I had somehow not seen or thought about before: “It could be that people in control of AI capabilities would respond negatively to AI safety people pushing for slower progress. But that should be called ‘we might get punished’ not ‘we shouldn’t defect’. ‘Defection’ has moral connotations that are not due. [...] On top of all that, I worry that highlighting the narrative that wanting more cautious progress is defection is further destructive, because it makes it more likely that AI capabilities people see AI safety people as thinking of themselves as betraying AI researchers, if anyone engages in any such efforts. Which makes the efforts more aggressive. Like, if every time you see friends, you refer to it as ‘cheating on my partner’, your partner may reasonably feel hurt by your continual desire to see friends, even though the activity itself is innocuous.”
‘We’ are not the US, ‘we’ are not the AI safety community — I’ve had conversations related to this that were pretty confused, and really appreciate seeing this written out.
Some excerpts:
“The starkest appearance of error along these lines to me is in writing off the slowing of AI as inherently destructive of relations between the AI safety community and other AI researchers. If we grant that such activity would be seen as a betrayal (which seems unreasonable to me, but maybe), surely it could only be a betrayal if carried out by the AI safety community. There are quite a lot of people who aren’t in the AI safety community and have a stake in this, so maybe some of them could do something. It seems like a huge oversight to give up on all slowing of AI progress because you are only considering affordances available to the AI Safety Community.”
“I more weakly suspect some related mental shortcut is misshaping the discussion of arms races in general. The thought that something is a ‘race’ seems much stickier than alternatives, even if the true incentives don’t really make it a race. Like, against the laws of game theory, people sort of expect the enemy to try to believe falsehoods, because it will better contribute to their racing. And this feels like realism. The uncertain details of billions of people one barely knows about, with all manner of interests and relationships, just really wants to form itself into an ‘us’ and a ‘them’ in zero-sum battle. This is a mental shortcut that could really kill us.”
Somewhat relatedly, I think EA should taboo “EA should”.
Convincing people doesn’t seem that hard — seems to hit & provide evidence for a position on a real crux. (As a side note, “[modern AI systems] are random connections jiggled in [a] gainful direction unfathomably many times, just as mysterious to their makers” is a great way to describe it.)
See also the discussion on LW, e.g. this comment (and this resource).
Technological choice is not luddism — I’ve seen the argument made (or the heuristic evoked) and appreciate this note.
Cheems mindset/can’t do attitude — I hadn’t heard this named, I think (except in narrower cases like learned helplessness etc.), and intuitively agree with its application here.