You were arguing above that the difference between your and Eliezer’s views makes much more than a 2x difference;
I was arguing that EV estimates have more than a 2x difference; I think this is pretty irrelevant to the deference model you’re suggesting (which I didn’t know you were suggesting at the time).
do you now agree that, on my account of deference, a big change in the deference-weight you assign to Eliezer plausibly leads to a much smaller change in your policy from the perspective of other worldviews, because the Eliezer-worldview trades off influence over most parts of the policy for influence over the parts that the Eliezer-worldview thinks are crucial and other policies don’t?
No, I don’t agree with that. It seems like all the worldviews are going to want resources (money / time) and access to that is ~zero-sum. (All the worldviews want “get more resources” so I’m assuming you’re already doing that as much as possible.) The bargaining helps you avoid wasting resources on counterproductive fighting between worldviews, it doesn’t change the amount of resources each worldview gets to spend.
Going from allocating 10% of your resources to 20% of your resources to a worldview seems like a big change. It’s a big difference if you start with twice as much money / time as you otherwise would have, unless there just happens to be a sharp drop in marginal utility of resources between those two points for some reason.
Maybe you think that there are lots of things one could do that have way more effect than “redirecting 10% of one’s resources” and so it’s not a big deal? If so can you give examples?
I think calibrated credences are badly-correlated with expected future impact
I agree overconfidence is common and you shouldn’t literally calculate a Brier score to figure out who to defer to.
I agree that directionally-correct beliefs are better correlated than calibrated credences.
When I say “evaluate beliefs” I mean “look at stated beliefs and see how reasonable they look overall, taking into account what other people thought when the beliefs were stated” and not “calculate a Brier score”; I think this post is obviously closer to the former than the latter.
I agree that people’s other goals make it harder to evaluate what their “true beliefs” are, and that’s one of the reasons I say it’s only 3⁄10 correlation.
I think coherence is very well-correlated with expected future impact (like, 5⁄10), because impact is heavy-tailed and the biggest sources of impact often require strong, coherent views. I don’t think it’s that hard to evaluate in hindsight, because the more coherent a view is, the more easily it’s falsified by history.
Re: correlation, I was implicitly also asking the question “how much does this vary across experts”. Across the general population, maybe coherence is 7⁄10 correlated with expected future impact; across the experts that one would consider deferring to I think it is more like 2⁄10, because most experts seem pretty coherent (within the domains they’re thinking about and trying to influence) and so the differences in impact depend on other factors.
Re: evaluation, it seems way more common to me that there are multiple strong, coherent, conflicting views that all seem compelling (see epistemic learned helplessness), which do not seem to have been easily falsified by history (in a sufficiently obvious manner that everyone agrees which one is false).
This too is in large part because we’re looking at experts in particular. I think we’re good at selecting for “enough coherence” before we consider someone an expert (if anything I think we do it too much in the “public intellectual” space), and so evaluating coherence well enough to find differences between experts ends up being pretty hard.
I think “hypothetical impact of past policies” is not that hard to evaluate. E.g. in Eliezer’s case the main impact is “people do a bunch of technical alignment work much earlier”, which I think we both agree is robustly good.
I feel like looking at any EA org’s report on estimation of their own impact makes it seem like “impact of past policies” is really difficult to evaluate?
Eliezer seems like a particularly easy case, where I agree his impact is probably net positive from getting people to do alignment work earlier, but even so I think there’s a bunch of questions that I’m uncertain about:
How bad is it that some people completely dismiss AI risk because they encountered Eliezer and found it off-putting? (I’ve explicitly heard something along the lines of “that crazy stuff from Yudkowsky” from multiple ML researchers.)
How many people would be working on alignment without Eliezer’s work? (Not obviously hugely fewer: Superintelligence plausibly still gets published, and Stuart Russell plausibly still goes around giving talks about value alignment and its importance.)
To what extent did Eliezer’s forceful rhetoric (as opposed to analytic argument) lead people to focus on the wrong problems?
I’ve now written up a more complete theory of deference here. I don’t expect that it directly resolves these disagreements, but hopefully it’s clearer than this thread.
Going from allocating 10% of your resources to 20% of your resources to a worldview seems like a big change.
Note that this wouldn’t actually make a big change for AI alignment, since we don’t know how to use more funding. It’d make a big change if we were talking about allocating people, but my general heuristic is that I’m most excited about people acting on strong worldviews of their own, and so I think the role of deference there should be much more limited than when it comes to money. (This all falls out of the theory I linked above.)
Across the general population, maybe coherence is 7⁄10 correlated with expected future impact; across the experts that one would consider deferring to I think it is more like 2⁄10, because most experts seem pretty coherent (within the domains they’re thinking about and trying to influence) and so the differences in impact depend on other factors.
Experts are coherent within the bounds of conventional study. When we try to apply that expertise to related topics that are less conventional (e.g. ML researchers on AGI; or even economists on what the most valuable interventions are) coherence drops very sharply. (I’m reminded of an interview where Tyler Cowen says that the most valuable cause area is banning alcohol, based on some personal intuitions.)
I feel like looking at any EA org’s report on estimation of their own impact makes it seem like “impact of past policies” is really difficult to evaluate?
The question is how it compares to estimating past correctness, where we face pretty similar problems. But mostly I think we don’t disagree too much on this question—I think epistemic evaluations are gonna be bigger either way, and I’m mostly just advocating for the “think-of-them-as-a-proxy” thing, which you might be doing but very few others are.
Note that this wouldn’t actually make a big change for AI alignment, since we don’t know how to use more funding.
Funding isn’t the only resource:
You’d change how you introduce people to alignment (since I’d guess that has a pretty strong causal impact on what worldviews they end up acting on). E.g. if you previously flipped a 10%-weighted coin to decide whether to send them down the Eliezer track or the other track, now you’d flip a 20%-weighted coin, and this straightforwardly leads to different numbers of people working on particular research agendas that the worldviews disagree about. Or if you imagine the community as a whole acting as an agent, you send 20% of the people to MIRI fellowships and the remainder to other fellowships (whereas previously it would be 10%).
(More broadly I think there’s a ton of stuff you do differently in community building, e.g. do you target people who know ML or people who are good at math?)
You’d change what you used political power for. I don’t particularly understand what policies Eliezer would advocate for but they seem different, e.g. I think I’m more keen on making sure particular alignment schemes for building AI systems get used and less keen on stopping everyone from doing stuff besides one secrecy-oriented lab that can become a leader.
Experts are coherent within the bounds of conventional study.
Yeah, that’s what I mean.