This comment is a general reply to this whole thread.
Some clarifications:
I don’t think that we should require that people working in AI safety have arguments for their research which are persuasive to anyone else. I’m saying I think they should have arguments which are persuasive to them.
I think that good plans involve doing things like playing around with ideas that excite you, and learning subjects which are only plausibly related if you have a hunch they could be helpful; I do these things a lot myself.
I think there’s a distinction between having an end-to-end story for your solution strategy vs the problem you’re trying to solve—I think it’s much more tractable to choose unusually important problems than to choose unusually effective research strategies.
In most fields, the reason you can pick more socially important problems is that people aren’t trying very hard to do useful work. It’s a little more surprising that you can beat the average in AI safety by trying intentionally to do useful work, but my anecdotal impression is that people who choose what problems to work on based on a model of what problems would be important to solve are still noticeably more effective.
Here’s my summary of my position here:
I think that being goal-directed is very helpful for making progress on problems on a week-by-week or month-by-month scale.
I think that within most fields, some directions are much more promising than others, and backchaining is required in order to work on the promising directions. AI safety is a field like this. Math is another: if I decided to try to do good by going into math, I’d end up doing research really different from what normal mathematicians do. I agree with Paul Christiano’s old post about this.
If I wanted to maximize my probability of solving the Riemann hypothesis, I’d probably try to pursue some crazy plan involving weird strengths of mine and my impression of blind spots of the field. However, I don’t think this is actually that relevant, because I think that the important work in AI safety (and most other fields of relevance to EA) is less competitive than solving the Riemann hypothesis, and also a less challenging mathematical problem.
I think that in my experience, people who do the best work on AI safety generally have a clear end-to-end picture of the work they need to do, and people who don’t have such a clear picture rarely do work I’m very excited about. For example, I think Nate Soares and Paul Christiano are both really good AI safety researchers, and both choose their research directions very carefully based on their sense of what problems are important to solve.
Sometimes I talk to people who are skeptical of EA because they have a stronger version of the position you’re presenting here—they think that nothing useful ever comes of people intentionally pursuing research that they think is important, and the right strategy is to pursue what you’re most interested in.
One way of thinking about this is to imagine that there are different problems in a field, and different researchers have different comparative advantages across the problems. In one extreme case, the problems vary wildly in importance, so comparative advantage basically doesn’t matter and you should work on whatever is most important. In the other extreme, it’s really hard to get a sense of which things are likely to be more useful than others, and your choices should be dominated by comparative advantage.
(Incidentally, you could also apply this to the more general problem of deciding what to work on as an EA. My personal sense is that the differences in values between different cause areas are big enough to basically dwarf comparative advantage arguments, but within a cause area comparative advantage is the dominant consideration.)
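As a very rough illustration of the two extremes above, here is a toy calculation (the numbers are invented purely for illustration) that models the value of a researcher’s choice as importance times personal productivity:

```python
# Toy model: value of a researcher working on a problem is
# importance(problem) * productivity(researcher, problem).
# All numbers are made up solely to illustrate the two regimes.

def values(importance, productivity):
    return {p: importance[p] * productivity[p] for p in importance}

# Regime 1: importance varies wildly (100x), productivity only modestly (~3x).
importance = {"problem_A": 100.0, "problem_B": 1.0}
productivity = {"problem_A": 0.3, "problem_B": 1.0}
print(values(importance, productivity))
# {'problem_A': 30.0, 'problem_B': 1.0} -> importance dominates the decision

# Regime 2: importance is hard to distinguish (~1.2x), productivity varies a lot (~3x).
importance = {"problem_A": 1.2, "problem_B": 1.0}
productivity = {"problem_A": 0.3, "problem_B": 1.0}
print(values(importance, productivity))
# {'problem_A': 0.36, 'problem_B': 1.0} -> comparative advantage dominates
```

The question in practice is which spread is larger within the field you are considering.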
I would love to see a high-quality investigation of historical examples here.
I mostly share your position, except that I think you would perhaps maximize the probability of solving the Riemann hypothesis by pursuing paths at the frontier of current research rather than starting something new (though I imagine there are many promising paths at the moment, which may account for the difference).
This Planners vs. Hayekians genre of dilemma seems very important to me, and it might be a crux in my career trajectory, or at least affect which projects I take on. I intuitively think that this question can be dissolved fairly easily, making it clear when each strategy is better, how parts of the EA worldview influence the answer, and perhaps how this affects the way we think about academic research. There is also a lot of existing literature on this matter, so there might already be a satisfying argument.
If someone here is up to a (possibly adversarial) collaboration on the topic, let’s do it!
The Planners vs. Hayekians dilemma seems related to some of the discussion in Realism about rationality, and especially to this crux for Abram Demski and Rohin Shah.
Broadly, two types of strategies in technical AI alignment work are:
Build solid mathematical foundations on which to build further knowledge, which would eventually let us reason more clearly about AI alignment.
Focus on targeted problems we can see today which are directly related to risks from advanced AI, and do our best to solve them (by heuristics or by tracing them back to related mathematical questions).
Borrowing Vanessa’s analogy of understanding the world as a castle, where each floor built on the one beneath represents hierarchically built knowledge: when one wants to build a castle out of unknown materials, under an unknown set of rules for its construction, but with a specific tower top in mind, one can either start by laying the groundwork well or start from some ideas of what could go directly below the tower top.
Planners start from the tower’s top, while Hayekians want to build solid ground and add on as many well-placed floors as they can.