Strong upvote. This is exactly the kind of post I’d like to see more often on the Forum: It summarizes many different points of view without trying to persuade anyone, points out some core areas of agreement, and names people who seem to believe different things (perhaps opening lines for productive discussion in the process). Work like this will be critical for EA’s future intellectual progress.
I’m not sure these considerations should be too concerning in this case for a couple of reasons.
I agree that it’s concerning when “conclusions… remain the same but the reasons given for holding those conclusions change” in cases where people originally (putatively) believe p because of x, then x is shown to be a weak consideration, and so they switch to citing y as a reason to believe p. But from your post it doesn’t seem like that is necessarily what has happened, rather than a conclusion being overdetermined by multiple lines of evidence. Of course, particular people in the field may have switched between some of these reasons, having decided that some of them are not so compelling. But in the case of many of the reasons cited above, the differences between the positions seem sufficiently subtle that we should expect people to clarify their own understanding by shifting to closely related positions (e.g. it seems plausible that someone might reasonably switch from thinking that the main problem is knowing how to precisely describe what we value, to thinking that the main problem is not knowing how to make an agent try to do that).
It also seems like a proliferation of arguments in favour of a position is not too concerning where there are plausible reasons why we should expect several of these considerations to apply simultaneously. For example, you might think that any kind of powerful agent typically presents a threat in multiple different ways, in which case it wouldn’t be suspicious if people cited multiple distinct considerations as to why they were worried.
I agree that it’s not too concerning, which is why I consider it weak evidence. Nevertheless, there are some changes which don’t fit the patterns you described. For example, it seems to me that newer AI safety researchers tend to consider intelligence explosions less likely, despite them being a key component of argument 1. For more details along these lines, check out the exchange between me and Wei Dai in the comments on the version of this post on the alignment forum.
Agreed. I think these reasons seem to fit fairly easily into the following schema: Each of A, B, C, and D is necessary for a good outcome. Different people focus on failures of A, failures of B, etc. depending on which necessary criterion seems to them most difficult to satisfy and most salient.
Hi Richard, really interesting! However, I think all six of your reasons still treat AGI as an independent agent. What do you think of Drexler’s reframing, https://www.fhi.ox.ac.uk/reframing/, which casts AGI as a comprehensive set of services? To me this makes the problem much more tractable and aligns better with how we see things actually progressing.
Drexler would disagree with some of Richard’s phrasing, but he seems to agree that most (possibly all) of (somewhat modified versions of) those 6 reasons should cause us to be somewhat worried. In particular, he’s pretty clear that powerful utility maximisers are possible and would be dangerous.
Yes: we already have increasingly powerful utility maximisers, and in many applications they are increasingly dangerous.
I think 4, 5 and 6 are all valid even if you take the CAIS view. Could you explain how you think those depend on the AGI being an independent agent?
Plausibly 2 and 3 also apply to CAIS, though those are more ambiguous.
6 describes the AGI as a “species”: services are not a species, agents are a species. 4 and 5 as written describe the AGI as an agent; once the AGI is described as an “it” that is doing something, that certainly sounds like an independent agent to me. A service and an agent are fundamentally different in nature, not just different views of the same thing, as the outcome would depend on the objectives of the instructing agent.
I’ve actually spent a fair while thinking about CAIS, and written up my thoughts here. Overall I’m skeptical about the framework, but if it turns out to be accurate I think that would heavily mitigate arguments 1 and 2, somewhat mitigate 3, and not affect the others very much. Insofar as 4 and 5 describe AGI as an agent, that’s mostly because it’s linguistically natural to do so—I’ve now edited some of those phrases. 6b does describe AI as a species, but it’s unclear whether that conflicts with CAIS, insofar as the claim that AI will never be agentlike is a very strong one, and I’m not sure whether Drexler makes it explicitly (I discuss this point in the blog post I linked above).
I don’t agree with being “skeptical about the framework”. Indeed, it seems a useful model for how we humans work. Through training we become expert, to varying degrees, at a range of tasks or services; as we get into a car, for example, we switch on our “driving services” module (and sub-modules). Underlying that, and separately, our unconscious drives the majority of our motivations as a “free agent”: our mammalian brain drives our socialising and norming behaviour, and beneath that our limbic brain deals with emotions like fear and status, which in my experience are the things that “move the money” if they are encouraged.
It does not seem to me we are particularly “generally intelligent”. Put in a completely unfamiliar setting without all the tools that now prop us up, we will struggle far more than a species already familiar in that environment.
To me, the intelligent agent approach takes the debate in the wrong direction and, most concerningly, dramatically understates the near and present danger of utility maximising services (“this is not superintelligence”), such as the example discussed by Yuval Noah Harari and Tristan Harris:
https://www.youtube.com/watch?v=v0sWeLZ8PXg
I think this is a good comment about how the brain works, but do remember that the human brain can both hunt in packs and do physics. Most systems you might build to hunt are not able to do physics, and vice versa. We’re not perfectly competent, but we’re still general.
I agree that the extent to which individual humans are rational agents is often overstated. Nevertheless, there are many examples of humans who spend decades striving towards distant and abstract goals, who learn whatever skills and perform whatever tasks are required to reach them, and who strategically plan around or manipulate the actions of other people. If AGI is anywhere near as agentlike as humans in the sense of possessing the long-term goal-directedness I just described, that’s cause for significant concern.
A lifetime spent learning to be a 9th dan master at go, perhaps? Building on the back of thousands of years of human knowledge and wisdom? Demolished in hours... I still look at the game and it looks incredibly abstract!
Don’t get me wrong, I am really concerned; I just consider the danger much closer than others do, but also more soluble if we look at the right problem and ask the right questions.