Is your claim just that people should generally “increase [their] error bars and widen [their] probability distribution”? (I was frustrated by the difficulty of figuring out what this post is actually claiming; it seems like it would benefit from an “I make the following X major claims...” TLDR.)
I probably disagree with your points about empiricism vs. rationalism (on priors that I dislike the way most people approach the two concepts), but I think I agree that most people should substantially widen their “error bars” and be receptive to new information. And it’s for precisely that reason that I feel decently confident in saying “most people whose risk estimates are very low (<0.5%) are significantly overconfident.” You logically cannot have extremely low probability estimates while also believing “there’s a >10% chance that in the future I will justifiably think there is a >5% chance of doom, but right now the evidence tells me the risk is <0.5%.”
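A minimal sketch of the arithmetic behind that last point, under my own framing (conservation of expected evidence / the law of total expectation), not necessarily the exact model the quoted belief assumes:

```python
# Toy consistency check (illustrative framing; the numbers come from the
# quoted belief above). By the law of total expectation, today's credence in
# doom is bounded below by the probability of the future justified update
# times the credence held in that future state.
p_future_update = 0.10   # assumed: >10% chance of later justifiably believing >=5% doom
future_credence = 0.05   # the >=5% doom credence in that future state
lower_bound = p_future_update * future_credence
print(f"implied lower bound on today's credence: {lower_bound:.3%}")  # 0.500%
# So a current estimate strictly below 0.5% conflicts with holding both beliefs.
```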
Thanks for the feedback! Definitely a helpful question. That error-bars answer was aimed at OpenPhil, based on what I’ve read from them on AI risk plus the prompt in their essay question. I’m sure many others are capable of answering the “what is the probability” forecasting question better/more directly than me, but my two cents was to step back and question underlying assumptions about forecasting that seem common in these conversations.
Hume wrote that “all probable reasoning is nothing but a species of sensation.” This doesn’t mean we should avoid probable reasoning (we can’t), but I think we should recognize that it is based only on our experiences/observations of the world, and question how rational its foundations are. I don’t think at this stage anyone actually has the empirical basis to give a meaningful % for “AI will kill everyone.” Call it 0.5 or 1 or 7 or whatever, but my essay is about trying to take a step back and question epistemological foundations. Anthropic seems much better at this so far (if they mean it when they say they’d stop given further empirical evidence of risks).
I did list two premises from Hume that I think are true (or truer than the average person concerned about AI x-risk holds them to be), so I suppose those were also my TLDR.
I see. (For others’ reference, those two points are pasted below)
1. All knowledge is derived from impressions of the external world. Our ability to reason is limited, particularly about ideas of cause and effect with limited empirical experience.
2. History shows that societies develop in an emergent process, evolving like an organism into an unknown and unknowable future. History was shaped less by far-seeing individuals informed by reason than by contexts which were far too complex to realize at the time.
Overall, I don’t really know what to make of these. They are fairly vague statements, making them very liable to motte-and-bailey interpretations; they border on deepities, in my reading.
“All knowledge is derived from impressions of the external world” might be true in the trivially obvious sense that you often need at least some iota of external information to develop accurate beliefs or effective actions (although even this might be somewhat untrue with regard to biological instincts). However, it makes no clear claim about how much and what kind of “impressions from the external world” are necessary for “knowledge.”[1] Insofar as the claim is that forecasts about AI x-risks are not “derived from impressions of the external world,” I think this is completely untrue. Under that interpretation, I also question whether the principle lives up to its own claims: what empirical evidence was the claim itself derived from?
The second claim suffers from similar problems in my view: I obviously wouldn’t claim that there have always been seers who could just divine the long-run future. However, insofar as it is saying that the future is so “unknowable” that people cannot reason about what actions in front of them are good, I also reject this: it seems obviously untrue with regard to, e.g., fighting Nazi Germany in WW2. Moreover, I would say that even if this has been true historically, that does not mean it will always be true, especially given the potential for value lock-in from superintelligent AI.
Overall, I agree that it’s important to be humble about our forecasts and that we should be actively searching for more information and methods to improve our accuracy, questioning our biases, etc. But I also don’t trust vague statements that could be interpreted as saying it’s largely hopeless to make decision-informing predictions about what to do in the short term to increase the chance of making the long-run future go well.
[1] A term I generally dislike for its ambiguity and philosophical denotations (which IMO are often dubious at best).
Thanks for the feedback. I agree that trying to present an alternative worldview ends up quite broad, and you raise some good counterexamples. And I certainly didn’t want to give this impression:
“it’s largely hopeless to make decision-informing predictions about what to do in the short term to increase the chance of making the long-run future go well.”
Instead I’d say that it is difficult to make these predictions based on a priori reasoning, which this community often tries for AI, and that we should shift resources towards rigorous empirical evidence to better inform our predictions. I tried to give specific examples: Anthropic-style alignment research is empiricist; Yudkowsky-style theorizing is a priori rationalist. This sort of epistemological critique of longtermism is somewhat common.
Ultimately, I’ve found that the line between empirical and theoretical analysis is often very blurry, and if someone does develop a decent bright line to distinguish the two, it turns out that there are often still plenty of valuable theoretical methods, and some of the empirical methods can be very misleading.
For example, high-fidelity simulations are arguably theoretical under most definitions, but they can be far more accurate than empirical tests.
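As a toy sketch of that point (with entirely hypothetical numbers, not anything from the post): a slightly miscalibrated but heavily sampled model can estimate a quantity more accurately than a handful of noisy real-world trials.

```python
# Hypothetical toy example: estimating a failure rate either from a small
# empirical test or from many runs of a slightly-wrong simulation model.
import random

random.seed(0)
TRUE_RATE = 0.12      # ground truth we would like to estimate

# "Empirical test": only 20 real trials, so the estimate is unbiased but noisy.
empirical_estimate = sum(random.random() < TRUE_RATE for _ in range(20)) / 20

# "High-fidelity simulation": 100,000 runs of a model whose assumed rate is
# slightly off (0.11 instead of 0.12), so the estimate is biased but stable.
MODEL_RATE = 0.11
sim_estimate = sum(random.random() < MODEL_RATE for _ in range(100_000)) / 100_000

print(f"true={TRUE_RATE}  empirical={empirical_estimate:.3f}  simulated={sim_estimate:.3f}")
```

Whether the simulation wins depends entirely on how good the underlying model is, which is roughly where the blurry line shows up.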
Overall, I tend to be quite supportive of using whatever empirical evidence we can, especially experimental methods when they are possible, but there are many situations where we cannot do this. (I’ve written more on this here: https://georgetownsecuritystudiesreview.org/2022/11/30/complexity-demands-adaptation-two-proposals-for-facilitating-better-debate-in-international-relations-and-conflict-research/ )