As someone who leans on the x-risk-skeptical side, especially regarding AI, I’ll offer my anecdote: I don’t think my views have been unfairly maligned or censored much.
I do think my arguments have largely been ignored, which is unfortunate. But I don’t personally feel the “massive social pressure” that titotal alluded to above, at least in a strong sense.
I think your “vibe” is skeptical and most of your writings express skepticism, but I think your object-level x-risk probabilities are fairly close to the median? People like titotal and @Vasco Grilo🔸 put their probabilities closer to the lifetime risk of death from a lightning strike than from heart disease.
Good point, but I still think that many of my beliefs and values differ pretty dramatically from the dominant perspectives in EA AI x-risk circles. I think these differences in my underlying worldview should carry just as much weight as, if not more than, whether my bottom-line estimates of x-risk align with the community’s median estimates. To elaborate:
On the values side:
Willingness to accept certain tradeoffs that are ~taboo in EA: I am comfortable with many scenarios where AI risk increases by a non-negligible amount if this accelerates AI progress. In other words, I think the potential benefits of faster AI progress can often outweigh the accompanying increase in existential risk.
Relative indifference to human disempowerment: With some caveats, I am largely comfortable with human disempowerment, and I don’t think the goal of AI governance should be to keep humans in control. To me, the preference for prioritizing human empowerment over other outcomes feels like an arbitrary form of speciesism—favoring humans simply because we are human, rather than due to any solid moral reasoning.
On the epistemic side:
Skepticism of AI alignment’s central importance to AI x-risk: I am skeptical that AI alignment is very important for reducing x-risk from AI. My primary threat model for AI risk doesn’t center on the idea that an AI with a misaligned utility function would necessarily pose a danger. Instead, I think the key issue lies in whether agents with differing values—be they human or artificial—will have incentives to cooperate and compromise peacefully or whether their environment will push them toward conflict and violence.
Doubts about the treacherous turn threat model: I believe the “treacherous turn” threat model is significantly overrated. (For context, this model posits that an AI system could pretend to be aligned with human values until it becomes sufficiently capable to act against us without risk.) I’ll note that both Paul Christiano and Eliezer Yudkowsky have identified this as their main threat model, but it is not my primary threat model.
People like titotal and @Vasco Grilo🔸 put their probabilities closer to the lifetime risk of death from a lightning strike than from heart disease.
Right. Thanks for clarifying, Linch. I guess the probability of human extinction over the next 10 years is 10^-6, which is roughly my probability of death from a lightning strike over the same period. “The odds of being struck by lightning in a given year are less than one in a million [I guess the odds are not much lower than this], and almost 90% of all lightning strike victims survive”, so 10^-6/year × 10 years × (1 − 0.9) = 10^-6.
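For what it’s worth, here is a minimal sketch of that arithmetic, using only the figures quoted above (the annual strike odds and survival rate are as quoted, not independently verified, and the variable names are mine):

```python
# Rough reconstruction of the lightning comparison above (figures as quoted, not verified).
annual_strike_prob = 1e-6  # "odds of being struck by lightning in a given year are less than one in a million"
survival_rate = 0.9        # "almost 90% of all lightning strike victims survive"
years = 10                 # the ten-year window used for the extinction estimate

death_prob_10y = annual_strike_prob * years * (1 - survival_rate)
print(death_prob_10y)      # ~1e-06, matching the 10^-6 extinction probability above
```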