If you’ve got a very high probability of AI takeover (obligatory reference!), then my first two arguments, at least, might seem very weak because essentially the only thing that matters is reducing the risk of AI takeover.
I do think the risk of AI takeover is much higher than you do, but I don’t think that’s why I disagree with the arguments for more heavily prioritizing the list of (example) cause areas that you outline. Rather, it’s a belief that’s slightly upstream of my concerns about takeover risk—that the advent of ASI almost necessarily[1] implies that we will no longer have our hands on the wheel, so to speak, whether for good or ill.
An unfortunate consequence of holding beliefs like mine about what a future with ASI in it involves is that those beliefs are pretty totalizing. They do suggest that “making the transition to a post-ASI world go well” is of paramount importance (putting aside questions of takeover risk). They do not suggest that it would be useful for me to think about most of the listed examples, except insofar as they feed into somehow getting a friendly ASI rather than something else. There are some exceptions: for example, if you have much lower odds of AI takeover than I do, but still expect ASI to have this kind of totalizing effect on the future, I claim you should find it valuable for some people to work on “animal welfare post-ASI”, and on whether there is anything that can meaningfully be done pre-ASI to reduce the risk of animal torture continuing into the far future[2]. But many of the other listed concerns seem very unlikely to matter post-ASI, and my impression is that you think we should be working on AI character or preserving democracy not as instrumental paths by which we reduce the risk of AI takeover, bad/mediocre value lock-in, etc., but because you consider things like that to be important separately from traditional “AI risk” concerns. Perhaps I’m misunderstanding?
[1] Asserted without argument, though many words have been spilled on this question in the past.
[2] It is perhaps not a coincidence that I expect this work to initially look like “do philosophy”, i.e. trying to figure out whether traditional proposals like CEV would permit extremely bad outcomes, looking for better alternatives, etc.
I’m not sure I understood the last sentence. I personally think that a bunch of areas Will mentioned (democracy, persuasion, human + AI coups) are extremely important, and likely more useful on the margin than additional alignment/control/safety work for navigating the intelligence explosion. I’m probably a bit less “aligned ASI is literally all that matters for making the future go well” pilled than you, but it’s definitely a big part of it.
I also don’t think that having higher odds of AI x-risk is a crux, though different “shapes” of intelligence explosion could be: e.g., if you think we’ll never get useful work for coordination/alignment/defense/AI strategy pre-foom, then I’d be more compelled by the totalising alignment view (though I do think that’s misguided).
I’m probably a bit less “aligned ASI is literally all that matters for making the future go well” pilled than you, but it’s definitely a big part of it.
Sure, but the vibe I get from this post is that Will believes in that a lot less than I do, and the reasons he cares about those things don’t primarily route through the totalizing view of ASI’s future impact. Again, I could be wrong or confused about Will’s beliefs here, but I have a hard time squaring the way this post is written with the idea that he intended to communicate that people should work on those things because they’re the best ways to marginally improve our odds of getting an aligned ASI. Part of this is the list of things he chose; part of it is the framing of them as cause areas distinct from “AI safety”. From my perspective, many of those areas already have at least a few people working on them under the label of “AI safety”/”AI x-risk reduction”.
Like, Lightcone has previously worked on, and continues to work on, “AI for better reasoning, decision-making and coordination”. I can’t claim to speak for the entire org, but when I’m doing that kind of work, I’m not trying to move the needle on how good the world ends up being conditional on us making it through, but on how likely we are to make it through at all. I don’t have that much probability mass on “we lose >10% but less than 99.99% of value in the lightcone”[1].
Edit: a brief discussion with Drake Thomas convinced me that 99.99% is probably a pretty crazy bound to have; let’s say 90%. Squeezing out that extra 10% involves work that you’d probably describe as “macrostrategy”, but that’s a pretty broad label.
[1] I haven’t considered the numbers here very carefully.
I don’t understand why you think some work on animal welfare post-ASI looks valuable, but not (e.g.) digital minds post-ASI and s-risks post-ASI. To me, it looks like working on these causes (and others?) has similar upsides (scale, neglectedness) and downsides (low tractability if ASI changes everything) to working on animal welfare post-ASI. Could you clarify why they’re different?
I don’t think that; animal welfare post-ASI is a subset of “s-risks post-ASI”.