An example of a possible “pivotal act” I like that isn’t “melt all GPUs” is:
Looking for pivotal acts that are less destructive (and, more importantly for humanity’s sake, less difficult to align) than “melt all GPUs” seems like a worthy endeavor to me. But I prefer the framing ‘let’s discuss the larger space of pivotal acts, brainstorm new ideas, and try to find options that are easier to achieve, because that particular toy proposal seems suboptimally dangerous and there just hasn’t been very much serious analysis and debate about pathways’. In the course of that search, if it then turns out that the most likely-to-succeed option is a process, then we should obviously go with a process.
But I don’t like constraining that search to ‘processes only, not acts’, because:
(a) I’m guessing something more local, discrete, and act-like will be necessary, even if it’s less extreme than “melt all GPUs”;
(b) insofar as I’m uncertain about which paths will be viable and think the problem is already extremely hard and extremely constrained, I don’t want to further narrow the space of options that humanity can consider and reason through;
(c) I worry that the “processes” framing will encourage more Rube-Goldberg-machine-like proposals, where the many added steps and layers and actors obscure the core world-saving cognition and action, making it harder to spot flaws and compare tradeoffs;
and (d) I worry that the extra steps, layers, and actors will encourage “design by committee” and slow-downs that doom otherwise-promising projects.
I suspect we also have different intuitions about pivotal acts because we have different high-level pictures of the world’s situation.
I think that humanity as it exists today is very far off from thinking like a serious civilization would about these issues. As a consequence, our current trajectory has a negligible chance of producing good long-run outcomes. Rather than trying to slightly nudge the status quo toward marginally better thinking, we have more hope if we adopt a heuristic like ‘speak candidly and realistically about things, as though we lived on the Earth that does take these issues seriously’, and hope that this seriousness and sanity might be infectious.
On my model, we don’t have much hope if we continue to half-say-the-truth, and continue to make small steady marginal gains, and continue to talk around the hard parts of the problem; but we do have the potential within us to just drop the act and start fully sharing our models and being real with each other, including being real about the parts where there will be harsh disagreements.
I think that a large part of the reason humanity is currently endangering itself is that everyone is too focused on ‘what’s in the Overton window?’, and is too much trying to finesse each other’s models and attitudes, rather than blurting out their actual views and accepting the consequences.
This makes the situation I described in “The inordinately slow spread of good AGI conversations in ML” much stickier: very little of the high-quality / informed public discussion of AGI is candid and honest, and people notice this, so updating and epistemic convergence are a lot harder; and everyone is dissembling in the same direction, toward ‘be more normal’, ‘treat AGI more like business-as-usual’, ‘pretend that the future is more like the past’.
All of this would make me less eager to lean into proposals like “yes, let’s rush into establishing a norm that large parts of the strategy space are villainous and not to be talked about” even if I agreed that pivotal processes are a better path to long-run good outcomes than pivotal acts. This is inviting even more of the central problem with current discourse, which is that people don’t feel comfortable even talking about their actual views.
You may not think that a pivotal act is necessary, but there are many who disagree with you. Of those, I would guess that most aren’t currently willing to discuss their thoughts, out of fear that the resultant discussion will toss norms of scholarly discussion out the window. This seems bad to me, and not like the right direction for a civilization to move in if it’s trying to emulate ‘the kind of civilization that handles AGI successfully’. I would rather have a world where humanity’s best and brightest were debating this seriously, doing scenario analysis, assigning probabilities and considering specific mainline and fallback plans, etc., than one where we prejudge ‘discrete pivotal acts definitely won’t be necessary’ and decide at the outset to roll over and die if it does turn out that pivotal acts are necessary.
My alternative proposal would be: Let’s bring scholarship to bear on the problem, discuss it seriously, and not let this topic be ruled by ‘what is the optimal social-media soundbite?’.
If the best idea sounds bad in soundbite form, then let’s have non-soundbite-length conversations about it. It’s an important enough topic, and a complex enough one, that this would IMO be a no-brainer in a world well-equipped to handle developments like AGI.
We should distinguish “safer” in the sense of “less likely to cause a bad outcome” from “safer” in the sense of “less likely to be followed by a bad outcome”.
E.g., the FDA banning COVID-19 testing in the US in the early days of the pandemic was “safer” in the narrow sense that it legitimately reduced the risk that COVID-19 tests would cause harm. But the absence of testing resulted in much more harm, and was “unsafe” in that sense.
Similarly: I’m mildly skeptical that humanity refusing to attempt any pivotal acts makes us safer from the particular projects that enact this norm. But I’m much more skeptical that humanity refusing to attempt any pivotal acts makes us safer from harm in general. These two versions of “safer” need to be distinguished and argued for separately.
Any proposal that adds red tape, inefficiencies, slow-downs, process failures, etc. will make AGI projects “safer” in the first sense, inasmuch as it cripples the project or slows it down to the point of irrelevance.
As someone who worries that timelines are probably way too short for us to solve enough of the “pre-AGI alignment prerequisites” to have a shot at aligned AGI, I’m a big fan of sane, non-adversarial ideas that slow down the field’s AGI progress today.
But from my perspective, the situation is completely reversed when you’re talking about slowing down a particular project’s progress when they’re actually building, aligning, and deploying their AGI.
At some point, a group will figure out how to build AGI. When that happens, I expect an AGI system to destroy the world within just a few years, if no pivotal act or process finishes occurring first. And I expect safety-conscious projects to be at a major speed disadvantage relative to less safety-conscious projects.
Adding any unnecessary steps to the process—anything that further slows down the most safety-conscious groups—seems like suicide to me, insofar as it either increases the probability that the project fails to produce a pivotal outcome in time, or increases the probability that the project cuts more corners on safety because it knows that it has that much less time.
I obviously don’t want the first AGI projects to rush into a half-baked plan and destroy the world. First and foremost, do not destroy the world by your own hands, or commit the fallacy of “something must be done, and this is something!”.
But I feel more worried about AGI projects insofar as they don’t have a lot of time to carefully align their systems (so I’m extremely reluctant to tack on any extra hurdles that might slow them down and that aren’t crucial for alignment), and also more worried insofar as they haven’t carefully thought about stuff like this in advance. (Because I think a pivotal act is very likely to be necessary, and I think disaster is a lot more likely if people don’t feel like they can talk candidly about it, and doubly so if they’re rushing into a plan like this at the last minute rather than having spent decades prior carefully thinking about and discussing it.)