“I’ve learned to motivate myself, create mini-deadlines, etc. This is a constant work in progress—I still have entire days where I don’t focus on what I should be doing—but I’ve gotten way better.”
What do you think has led to this improvement, aside from just time and practice? Favorite tips / tricks / resources?
Thanks for this. I was curious about “Pick a niche or undervalued area and become the most knowledgeable person in it.” Do you feel comfortable saying what the niche was? Or even if not, can you say a bit more about how you went about doing this?
This is very interesting! I’m excited to see connections drawn between AI safety and the law / philosophy of law. It seems there are a lot of fruitful insights to be had.
You write,
The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate.
Can you elaborate a bit on this?
I don’t know anything about the history of these rules of evidence. But why think that, over this history, the rules have trended toward truth-seeking per se? I wouldn’t be surprised if the rules have evolved to better serve the purposes of the legal system over time, but presumably the relationship between that end and truth-seeking is quite complex. Also, the people changing the rules could be mistaken about what sorts of evidence do in fact tend to lead to wrong decisions.
I think all of this is compatible with your claim. But I’d like to hear more!
Thanks for the great summary! A few questions about it:
1. You call mesa-optimization “the best current case for AI risk”. As Ben noted at the time of the interview, this argument hasn’t yet really been fleshed out in detail. And as Rohin subsequently wrote in his opinion of the mesa-optimization paper, “it is not yet clear whether mesa optimizers will actually arise in practice”. Do you have thoughts on what exactly the “Argument for AI Risk from Mesa-Optimization” is, and/or a pointer to the places where, in your opinion, that argument has been made (aside from the original paper)?
2. I don’t entirely understand the remark about the reference class of ‘new intelligent species’. What species are in that reference class? Many species that we regard as quite intelligent (orangutans, octopuses, New Caledonian crows) aren’t risky. Presumably you mean a reference class like “new species as smart as humans” or “new ‘generally intelligent’ species”. But then the reference class is very small, and it’s hard to know how strong that prior should be. In any case, how were you thinking of this reference class argument?
3. ‘The Boss Baby’, starring Alec Baldwin, is available for rental on Amazon Prime Video for $3.99. I suppose this is more of a comment than a question.
That’s a great point. A related point that I hadn’t really clocked until someone pointed it out to me recently, though it’s obvious in retrospect, is that (EA aside) in an academic department it is structurally unlikely that you will have a colleague who shares your research interests to a large extent. It’s rare for a department to be big enough to have two people doing the same thing, since departments need coverage of their whole field for teaching and supervision.