I worry very little about losing the opportunity to get external criticism from people who wouldn’t engage very deeply with our work if they did have access to it. I worry more about us doing worse research because it’s harder for extremely engaged outsiders to contribute to our work.
A few years ago, Holden had a great post where he wrote:
> For nearly a decade now, we’ve been putting a huge amount of work into putting the details of our reasoning out in public, and yet I am hard-pressed to think of cases (especially in more recent years) where a public comment from an unexpected source raised novel important considerations, leading to a change in views. This isn’t because nobody has raised novel important considerations, and it certainly isn’t because we haven’t changed our views. Rather, it seems to be the case that we get a large amount of valuable and important criticism from a relatively small number of highly engaged, highly informed people. Such people tend to spend a lot of time reading, thinking and writing about relevant topics, to follow our work closely, and to have a great deal of context. They also tend to be people who form relationships of some sort with us beyond public discourse.
>
> The feedback and questions we get from outside of this set of people are often reasonable but familiar, seemingly unreasonable, or difficult for us to make sense of. In many cases, it may be that we’re wrong and our external critics are right; our lack of learning from these external critics may reflect our own flaws, or difficulties inherent to a situation where people who have thought about a topic at length, forming their own intellectual frameworks and presuppositions, try to learn from people who bring very different communication styles and presuppositions.
>
> The dynamic seems quite similar to that of academia: academics tend to get very deep into their topics and intellectual frameworks, and it is quite unusual for them to be moved by the arguments of those unfamiliar with their field. I think it is sometimes justified and sometimes unjustified to be so unmoved by arguments from outsiders.
>
> Regardless of the underlying reasons, we have put a lot of effort over a long period of time into public discourse, and have reaped very little of this particular kind of benefit (though we have reaped other benefits—more below). I’m aware that this claim may strike some as unlikely and/or disappointing, but it is my lived experience, and I think at this point it would be hard to argue that it is simply explained by a lack of effort or interest in public discourse.
My sense is pretty similar to Holden’s, though we’ve put much less effort into explaining ourselves publicly. When we’re thinking about topics like decision theory, which have a whole academic field, we seem to get very little out of interacting with that field. This might be because we’re actually interested in different questions and academic decision theory doesn’t have much to offer us (eg see this Paul Christiano quote and this comment).
I think that MIRI also empirically doesn’t change its strategy much as a result of talking to highly engaged people who have very different worldviews (eg Paul Christiano), though individual researchers (eg me) often change their minds from talking to these people. (Personally, I also change my mind from talking to non-very-engaged people.)
Maybe talking to outsiders doesn’t shift MIRI strategy because we’re totally confused about how to think about all of this. But I’d be surprised if we figured this out soon, given that we haven’t figured it out so far. So I’m pretty willing to say “look, either MIRI’s onto something or not; if we’re onto something, we should go for it wholeheartedly, and I don’t seriously think that we’re going to update our beliefs much from more public discourse, so it doesn’t seem that bad to have our public discourse become costlier”.
I guess I generally don’t feel that convinced that external criticism is very helpful in situations like ours, where there isn’t an established research community with taste that’s relevant to our work. Physicists have had a lot of time to develop a reasonably healthy research culture in which they notice what kinds of arguments are wrong; I don’t think AI alignment has that resource to draw on. And in cases where you don’t have an established base of knowledge about what kinds of arguments are helpful (sometimes people call this “being in a preparadigmatic field”; I don’t know if that’s correct usage), I think it’s plausible that people with different intuitions should do divergent work for a while and hope that eventually some of them make progress that’s persuasive to the others.
By not engaging with critics as much as we could, I think MIRI is probably increasing the probability that we’re barking up the wrong tree entirely. I just think that this gamble is worth taking.
I’m more concerned about costs incurred because we’re more careful about sharing research with highly engaged outsiders who could help us with it. Eg Paul has made some significant contributions to MIRI’s research, and it’s a shame to have less access to his ideas about our problems.