I’ve now written up a more complete theory of deference here. I don’t expect that it directly resolves these disagreements, but hopefully it’s clearer than this thread.
Going from allocating 10% of your resources to 20% of your resources to a worldview seems like a big change.
Note that this wouldn’t actually make a big change for AI alignment, since we don’t know how to use more funding. It’d make a big change if we were talking about allocating people, but my general heuristic is that I’m most excited about people acting on strong worldviews of their own, and so I think the role of deference there should be much more limited than when it comes to money. (This all falls out of the theory I linked above.)
Across the general population, maybe coherence is 7/10 correlated with expected future impact; across the experts one would consider deferring to, I think it is more like 2/10, because most experts seem pretty coherent (within the domains they’re thinking about and trying to influence), so the differences in impact depend on other factors.
Experts are coherent within the bounds of conventional study. When we try to apply that expertise to related topics that are less conventional (e.g. ML researchers on AGI; or even economists on what the most valuable interventions are) coherence drops very sharply. (I’m reminded of an interview where Tyler Cowen says that the most valuable cause area is banning alcohol, based on some personal intuitions.)
I feel like looking at any EA org’s report on estimation of their own impact makes it seem like “impact of past policies” is really difficult to evaluate?
The question is how it compares to estimating past correctness, where we face pretty similar problems. But mostly I think we don’t disagree too much on this question—I think epistemic evaluations are gonna be bigger either way, and I’m mostly just advocating for the “think-of-them-as-a-proxy” thing, which you might be doing but very few others are.
Note that this wouldn’t actually make a big change for AI alignment, since we don’t know how to use more funding.
Funding isn’t the only resource:
You’d change how you introduce people to alignment (since I’d guess that has a pretty strong causal impact on what worldviews they end up acting on). E.g. if you previously flipped a 10%-weighted coin to decide whether to send them down the Eliezer track or the other track, now you’d flip a 20%-weighted coin, and this straightforwardly leads to different numbers of people working on particular research agendas that the worldviews disagree about. Or if you imagine the community as a whole acting as an agent, you send 20% of the people to MIRI fellowships and the remainder to other fellowships (whereas previously it would be 10%).
(More broadly I think there’s a ton of stuff you do differently in community building, e.g. do you target people who know ML or people who are good at math?)
You’d change what you used political power for. I don’t particularly understand what policies Eliezer would advocate for, but they seem different; e.g. I think I’m more keen on making sure particular alignment schemes for building AI systems get used, and less keen on stopping everyone from doing stuff except one secrecy-oriented lab that can then become a leader.
Experts are coherent within the bounds of conventional study.
Yeah, that’s what I mean.