Thanks for this in-depth writeup of what is clearly a very important factor in prioritising our work aimed at the AI transition. Your piece has built the argument for such prioritisation clearly enough that it has allowed me to put some previously inchoate responses into a more crisp form:
If we could tell with certainty which topics would receive >100x as much work as we could put in prior to when that work is needed, then I think your argument goes through. But I have a lot of uncertainty about that and such uncertainty weakens the prioritisation effect substantially.
To see the effect easily, suppose for simplicity that for some piece of apparently late-stage strategy there is a 50% chance that >100x as much work gets done on it, obviating the need for us to work on it now, and a 50% chance that there is no appreciable extra work done (e.g. because the intelligence explosion is happening in a particular lab that doesn’t do this work, or because the work requires aspects of cognition that are improving more slowly, or because it turns out it was needed earlier in the explosion than expected).
In this case, the expected value of marginal work on that late-stage strategy gets roughly halved compared to if there weren’t going to be this AI-driven work later (a 50% chance of the naive estimate + a 50% chance of <1% of that estimate). Given the fairly extreme distribution in the value of a particular person working on different topics, it isn’t that rare for the best thing to work on in one category to be >2x as good as the best thing in another category, such that you shouldn’t switch categories even after downgrading the EV.
That would mean early-stage vs late-stage would be an important factor in choosing what to work on, but not any kind of filter on what to work on. As the chance of large amounts of AI work on that topic increases, the factor gets stronger: e.g. it reaches 10x at a 90% chance, which is quite strong (though I think it is hard to reach or exceed a 90% chance here).
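To spell out the toy model behind those numbers (a minimal sketch; writing $V$ for the naive estimate of the value of our marginal work, $p$ for the chance that later AI work obviates it, and $\varepsilon < 0.01$ for the fraction of value that remains if it does):

$$
\mathbb{E}[\text{value}] \;=\; (1-p)\,V \;+\; p\,\varepsilon V \;\approx\; (1-p)\,V .
$$

At $p = 0.5$ this is roughly $V/2$ (the halving above), and at $p = 0.9$ it is roughly $V/10$ (the 10x factor).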
So I think this can have a substantial effect on the choice of what to work on at the margin, but isn’t a filter.
What about its effect on the portfolio of research work aimed at the AI transition?
Suppose that there are logarithmic returns to the research work (which means that the marginal value of extra work is inversely proportional to aggregate work so far; this is a common neglectedness assumption). In that case, compared with an equally important topic that has no chance of being obviated, we should do 50% as much total work on one we estimate to have a 50% chance of being obviated later, and 10% as much total work on one we estimate to have a 90% chance of being obviated later.
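A quick sketch of where those shares come from, under the same toy assumptions (with $V$ a topic’s importance, $w$ the total work we do on it, and $p$ its chance of being obviated): with logarithmic returns, the marginal value of the last unit of work is about $V/w$, so

$$
\mathbb{E}[\text{marginal value}] \;\approx\; (1-p)\,\frac{V}{w},
$$

and allocating a fixed budget of work so that this is equalised across topics gives $w \propto (1-p)\,V$: for equally important topics, 50% as much work at $p = 0.5$ and 10% as much at $p = 0.9$.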
So that is still quite a large share of our total work going into late-stage things, even when we don’t think they are intrinsically more important. In the piece you suggested that we do at least some work on these topics, to avoid the possibility of being caught completely flat-footed if the anticipated AI work on those topics doesn’t happen, and I think the maths above suggests a larger amount of work than that (especially on topics that appear to be more important or more tractable).
(Note that my simplifying assumption of no appreciable AI help vs an overwhelming amount might be doing some work here. I’m not sure what the best way to relax it is.)
Thanks, I agree with your mathematics and think this framework is helpful for letting us zoom in on possible disagreements.
There are two places where I find myself sceptical of the framing in your comment:
You’re framing it as something like a haircut on the value of working on a particular topic. But I think that if you’re targeting just some set of worlds where we don’t get meaningful automated assistance for certain kinds of strategy work, it might be important to explicitly condition on that, rather than just think of it as a haircut.
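To make the distinction concrete (a toy formulation; writing $A$ for the event that we don’t get timely automated assistance on the topic, and $V$ for the naive estimate of the value of working on it):

$$
\mathbb{E}[\text{value}] \;=\; P(A)\,\mathbb{E}[\text{value}\mid A] \;+\; P(\neg A)\,\mathbb{E}[\text{value}\mid \neg A] \;\approx\; P(A)\,\mathbb{E}[\text{value}\mid A],
$$

where the last step assumes the work adds little in worlds with abundant AI assistance. The haircut treats $\mathbb{E}[\text{value}\mid A]$ as equal to $V$; conditioning asks how the $A$-worlds themselves differ from the unconditional picture.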
We might then also ask about whether we have more or less leverage on these worlds. Some takes:
- Of course it will depend a bunch on the specifics of the question we’re wondering whether we can punt on.
- Broadly I think worlds where we don’t get a bunch of automated assistance for strategy in a timely fashion look significantly worse than worlds where we do get this assistance.
- This is compatible with either being higher-leverage.
- An important type of leverage we may have is the possibility of moving worlds from the first bucket to the second.
- I’m actually pretty unsure which worlds we have more leverage over (there seem to be quite a lot of considerations pointing each way), and suspect that this question is more to-the-point than thinking of it as a haircut.
You say that it’s hard to reach or exceed a 90% chance. I find myself dubious of this. If we see something like a technological singularity, there will be an enormous amount of progress on very many dimensions. Some of the things that actors will eventually want to do will be tremendously harder than others. I certainly don’t want to assume that people will get around to doing things that are a good idea and are possible as early as it’s feasible for them to do so; but I think that if you go somewhat past that, to the point where it’s very easy to get the good thing, it’s not hard for it to be a >90% safe bet that it will happen.
As an analogy, I imagine people before the industrial revolution might reasonably have predicted that there would be a lot more capacity for thinking about space strategy before anyone went to the moon.
Maybe there’s a common theme here: I have the impression that I’m more imagining a default world where we get these upgrades to strategic capacity in a timely fashion, and then considering deviations from that; and you’re more saying “well maybe things look like that, but maybe they look quite different”, and less privileging the hypothesis.
I guess I do just think it’s appropriate to privilege this hypothesis. We’ve written about how even current or near-term AI could serve to power tools which advance our strategic understanding. I think that this is a sufficiently obvious set of things to build, and there will be sufficient appetite to build them, that it’s fair to think it will likely be getting in gear (in some form or another) before most radically transformative impacts hit. I wouldn’t want to bet everything on this hypothesis, but I do think it’s worth exploring what betting on it properly would look like, and then committing a chunk of our portfolio to that (if it’s not actively bad on other perspectives).
Thanks Owen. I also agree with your maths.
Re conditioning, I agree that this is the technically correct thing to do and that it isn’t clear what difference it makes to the simpler analysis. In some cases it is fairly easy to condition (e.g. if working on a late-stage topic, one can do the project while imagining that there isn’t lots of advanced AI advice available in time when the issue arrives), while at the prioritisation stage it feels a bit harder to do. Oh, and I very much agree that it could be important to act to change whether such AI analysis happens (something that is, if anything, a bit easier to see on a view that treats whether this happens as uncertain).
Re maximal reasonable probabilities, I still genuinely feel like it is hard to get >90% credence that very large amounts of AI analysis on a key issue will happen prior to the issue coming to a head. I think one could get there for some things, but not that many. This is due to there being a variety of defeaters for such high amounts of AI analysis, such as external people like us not having access to the tools, needing the analysis earlier than expected (e.g. due to the need to socialise the ideas), or jaggedness in AI capabilities (e.g. where engineering abilities take off substantially before more conceptual, philosophical abilities). I think you are onto something re what you are imagining as the default vs what I am.