I’ve thought about why I buy the AI risk arguments despite the low base rate of similar-sounding arguments being correct, and I think the reason touches on some important and nontrivial concepts.
When most people encounter a complicated argument like the ones for working on AI risk, they are in a state of epistemic learned helplessness: they have seen many convincing arguments of a similar form turn out to be wrong, or heard equally convincing arguments for both sides of a question. For them, the fact that an argument sounds convincing is not much evidence that it’s true.
Epistemic learned helplessness is often good, because in real life arguments are tricky and people are taken in by false ones. Taken to an extreme, though, it becomes overly modest epistemology: the idea that you shouldn’t trust your own models or reasoning whenever other people whose beliefs look similar on the surface disagree with you. Modest epistemology would lead you to believe that there’s a 1/3 chance you’re currently asleep (since that’s roughly the fraction of their lives people spend sleeping), or that the correct religion is 31.1% likely to be Christianity, 24.9% likely to be Islam, and 15.2% likely to be Hinduism (in proportion to each religion’s share of adherents).
I think that EA does have something in common with religious fundamentalists: an orientation away from modest epistemology and towards taking weird ideas seriously. (I suspect the proportion of senior EAs who used to be religious fundamentalists, or who take other weird ideas seriously, is well above the base rate.) So why do I think I’m justified in spending my career on AI safety research or field-building? Because I think the community has better epistemic processes than average.
If you want to reject modest epistemology and still believe true things, you need a thinking process that arrives at the truth more often than average, whether through calibration, smarter people, longer or more careful thinking, or more encouragement of skepticism. From the inside, the EA/rationalist subcommunity working on AI risk is clearly better than most millenarians (you should be well-calibrated about this claim, but you can’t just say “but what about from the outside?”; that’s modest epistemology). If I think about something for long enough, talk it over with my colleagues, post it on the EA Forum, invite red-teaming, and so on, I expect to reach the correct conclusion eventually, or at least to decide that the argument is too tricky and remain unsure (rather than end up irreversibly convinced of the wrong conclusion). I’m very worried about this ceasing to be the case.
Taking weird ideas seriously is crucial for our impact: I picture a train to crazy town, where each successive stop multiplies our impact by more than 2x and has weirder ideas than the last, and at some point the weird ideas stop being correct. Good epistemics are therefore also crucial for our impact: they are what tell us which stop to get off at.
I really appreciate this response, which I think understands me well; I also think it expresses some of my ideas better than I did. Kudos, Thomas. I have a better appreciation of where we differ after reading it.

I’m curious where exactly you see your opinions as differing here. Is it just how much to trust inside vs outside view, or something else?

I’m not sure that it’s purely “how much to trust inside vs outside view,” but I think that is at least a very large share of it. I also think the point about what I would call humility (“epistemic learned helplessness”) is basically correct. All of this is by degrees, but judging by his reasoning, I fall more toward the epistemically humble end of the spectrum than Thomas does. I also appreciate any time someone brings up the train to crazy town, which I think is an excellent turn of phrase that captures an important idea.

I really enjoyed this comment, thanks for writing it, Thomas!
Thanks a lot for this comment! I think delving into the topic of epistemic learned helplessness will help me learn how to form proper inside views, which is something I’ve been struggling with.
“I’m very worried about this ceasing to be the case.”
Are you worried just because it would be really bad if EA in the future (say 5 years) was much worse at coming to correct conclusions, or also because you think it’s likely that will happen?
I’m not sure how likely this is, but probably over 10%? I’ve heard that social movements generally get unwieldier as they become more mainstream. Also, some people say this has already happened to EA, and they now identify as rationalists or longtermists instead. It’s hard to form a reference class, because I don’t know how much EA benefits from advantages like better organization and a currently better culture.
To form proper inside views, I’d also recommend reading this post, which (among other things) sketches out a method for healthy deference:
I think that something like this might be a good metaphor for how you should relate to doing good in the world, or to questions like “is it good to work on AI safety”. You try to write down the structure of an argument, and then fill out the steps of the argument, breaking them into more and more fine-grained assumptions. I am enthusiastic about people knowing where the sorrys are—that is, knowing what assumptions about the world they’re making. Once you’ve written down in your argument “I believe this because Nick Bostrom says so”, you’re perfectly free to continue believing the same things as before, but at least now you’ll know more precisely what kinds of external information could change your mind.
The key event that does good here, I think, is when you realize you were making an assumption you hadn’t noticed, or when you realize that you thought you understood the argument for X but actually can’t persuade yourself of X given only the arguments you already have.
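For readers who haven’t seen the source of the metaphor: in the Lean proof assistant, `sorry` is a placeholder that lets an unfinished proof compile while flagging exactly which steps remain unjustified. Here is a minimal sketch of the idea in Lean 4; all proposition and theorem names are hypothetical, invented purely for illustration:

```lean
-- Hypothetical propositions standing in for steps of the argument.
axiom CapabilitiesKeepGrowing : Prop
axiom AlignmentIsHard : Prop
axiom ShouldWorkOnSafety : Prop

-- Leaf claims we have not actually justified yet: each `sorry` marks
-- an assumption, e.g. "I believe this because Nick Bostrom says so".
theorem capabilities : CapabilitiesKeepGrowing := sorry
theorem hardness : AlignmentIsHard := sorry

-- The inference step is also left open; writing it down at all is
-- what "writing down the structure of the argument" means here.
theorem inference :
    CapabilitiesKeepGrowing → AlignmentIsHard → ShouldWorkOnSafety := sorry

-- The conclusion follows mechanically from the pieces above. Lean
-- accepts it but warns that it depends on `sorry`, so you can see
-- exactly where the unexamined assumptions live.
theorem conclusion : ShouldWorkOnSafety :=
  inference capabilities hardness
```

This is only a sketch of the analogy, but it shows the property the excerpt is pointing at: the overall structure type-checks even while the leaves are open, so you know precisely what external information could change your mind.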
Thanks for writing this.