I think your "vibe" is skeptical, and most of your writings are ones expressing skepticism, but I think your object-level x-risk probabilities are fairly close to the median? People like titotal and @Vasco Grilo🔸 have their probabilities closer to the lifelong risk of death from a lightning strike than from heart disease.
Good point, but I still think that many of my beliefs and values differ pretty dramatically from the dominant perspectives often found in EA AI x-risk circles. I think these differences in my underlying worldview should carry at least as much weight as whether my bottom-line estimates of x-risk align with the median estimates in the community. To elaborate:
On the values side:
Willingness to accept certain tradeoffs that are ~taboo in EA: I am comfortable with many scenarios where AI risk increases by a non-negligible amount if this accelerates AI progress. In other words, I think the potential benefits of faster AI development can often outweigh the accompanying increase in existential risk.
Relative indifference to human disempowerment: With some caveats, I am largely comfortable with human disempowerment, and I don't think the goal of AI governance should be to keep humans in control. To me, the preference for prioritizing human empowerment over other outcomes feels like an arbitrary form of speciesism: favoring humans simply because we are human, rather than due to any solid moral reasoning.
On the epistemic side:
Skepticism of AI alignment's central importance to AI x-risk: I am skeptical that AI alignment is very important for reducing x-risk from AI. My primary threat model for AI risk doesn't center on the idea that an AI with a misaligned utility function would necessarily pose a danger. Instead, I think the key issue lies in whether agents with differing values, be they human or artificial, will have incentives to cooperate and compromise peacefully, or whether their environment will push them toward conflict and violence.
Doubts about the treacherous turn threat model: I believe the "treacherous turn" threat model is significantly overrated. (For context, this model posits that an AI system could pretend to be aligned with human values until it becomes sufficiently capable to act against us without risk.) I'll note that both Paul Christiano and Eliezer Yudkowsky have identified this as their main threat model, but it is not my primary threat model.
people like titotal and @Vasco Grilo🔸 have their probabilities closer to lifelong risk of death from a lightning strike than from heart disease.
Right. Thanks for clarifying, Linch. I guess the probability of human extinction over the next 10 years is 10^-6, which is roughly my probability of dying from a lightning strike over the same period. "The odds of being struck by lightning in a given year are less than one in a million [I guess the odds are not much lower than this], and almost 90% of all lightning strike victims survive" (10^-6 = 10^-6*10*(1 - 0.9)).
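For what it's worth, here is the same back-of-the-envelope calculation written out as a small Python sketch. It is purely illustrative and uses only the rough figures quoted above (the variable names are my own):

```python
# Sanity check of the arithmetic above, using the rough figures quoted
# in the comment: ~1-in-a-million chance of being struck by lightning
# in a given year, and ~90% of strike victims survive.
annual_strike_prob = 1e-6      # P(struck by lightning in a given year)
years = 10                     # horizon considered in the comment
fatality_rate = 1 - 0.9        # ~10% of strike victims die

# P(dying from a lightning strike over the next 10 years); summing over
# years is a fine approximation because the probabilities are tiny.
p_death_10y = annual_strike_prob * years * fatality_rate
print(f"{p_death_10y:.0e}")    # 1e-06, matching the 10^-6 figure above
```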