I see! Thanks for the clarification. It’s a fascinating argument if I’m understanding it correctly now: it could be worth substantially increasing our risk of extinction if doing so even more substantially increased our odds of capturing more of the potential value in our light cone.
I’m not a dedicated utilitarian, so I tend to value futures with some human flourishing and little suffering vastly higher than futures with no sentient beings. But I am actually convinced that we should tilt a little toward futures with more flourishing.
Aligning AGI seems like the crux for both survival and flourishing (and aligning society, in the likely case that “aligned” AGI is intent-aligned to take orders from individuals). But there will be small changes in strategy that emphasize flourishing over mere survival, and I’ll lean toward those based on this discussion, because outside of myself and my loved ones, my preferences become largely utilitarian.
It should also be borne in mind that creating misaligned AGI runs a pretty big risk of wiping out not just us but any other sentient species in the light cone.
I don’t have a nice clean citation. I don’t think one exists. I’ve looked at an awful lot of individual opinions and different surveys. I guess the biggest reason I’m convinced this correlation exists is that arguments for low p(doom) very rarely actually engage the arguments for risk at their strong points (and when they do, the discussions are inconclusive in both directions; I’m not arguing that alignment is hard, but that it’s very much unknown how hard it is).
There appears to be a very high correlation between misunderstanding the state of play and optimism. And because the state of the arguments is very complex, the vast majority of the world misunderstands it pretty severely.
I very much wish it were otherwise; I am an optimist who has become steadily more pessimistic as I’ve made alignment my full-time focus, because the arguments against optimism are subtle (and often poorly communicated) but strong.
The arguments for the difficulty of alignment are far too strong to be rationally dismissed down to the 1.4% or whatever it was that the superforecasters arrived at. They have very clearly missed some important points of argument.
The anticorrelation with academic success seems quite right and utterly irrelevant. As a career academic, I have been noticing for decades that the incentives around academic success are quite perverse.
I agree that there are bad arguments for pessimism as well as for optimism. The use of bad logic in some prominent arguments says nothing about the strength of other arguments. Arguments on both sides are far from conclusive. So you can hope that the arguments for the fundamental difficulty of aligning network-based AGI are wrong, but assigning a high probability that they’re wrong, without understanding them in detail and constructing valid counterarguments, is tempting but not rational.
If there’s a counterargument you find convincing, please point me to it! Because while I’m arguing from the outside view, my real argument is that this is an issue that is unique in intellectual history, so it can really only be evaluated from the inside view. So that’s where most of my thoughts on the matter go.
All of which isn’t to say the doomers are right and we’re doomed if we don’t stop building network-based AGI. I’m saying we don’t know. I’m arguing that, with our current limited knowledge, assigning a high probability to humanity accomplishing alignment is not rationally justified.
I think that fact is reflected in p(doom) correlating with time-on-task only for alignment specifically. If that’s wrong I’d be shocked, because it looks very strong to me, and I do work hard to correct for my own biases. But it’s possible I’m wrong about this correlation. If so, it will make my day and perhaps my month or year!
It is ultimately a question that needs to be resolved at the object level; in the meantime, we just have to make guesses about how to allocate resources based on outside views.