(When I say “AGI” I think I’m talking about the same thing that you called digital “beings” in this comment.)
Here are a bunch of agreements & disagreements.
if François is right, then I think this should be considered strong evidence that work on AI Safety is not overwhelmingly valuable, and may not be one of the most promising ways to have a positive impact on the world.
I think François is right, but I do think that work on AI safety is overwhelmingly valuable.
Here’s an allegory:
There’s a fast-breeding species of extraordinarily competent and ambitious intelligent aliens. They can do science much much better than Einstein, they can run businesses much much better than Bezos, they can win allies and influence much much better than Hitler or Stalin, etc. And they’re almost definitely (say >>90% chance) coming to Earth sooner or later, in massive numbers that will keep inexorably growing, but we don’t know exactly when this will happen, and we also don’t know in great detail what these aliens will be like—maybe they will have callous disregard for human welfare, or maybe they’ll be great. People have been sounding the alarm for decades that this is a big friggin’ deal that warrants great care and advanced planning, but basically nobody cares.
Then some scientist Dr. S says “hey those dots in the sky, maybe they’re the aliens! If so they might arrive in the next 5-10 years, and they’ll have the following specific properties”. All of a sudden there’s a massive influx of societal interest: interest in the dots in particular, and interest in alien preparation in general.
But it turns out that Dr. S was wrong! The dots are small meteors. They might hit Earth and cause minor damage but nothing unprecedented. So we’re back to not knowing when the aliens will come or what exactly they’ll be like.
Is Dr. S’s mistake “strong evidence that alien prep is not overwhelmingly valuable”? No! It just puts us back where we were before Dr. S came along.
(end of allegory)
(Glossary: the “aliens” are AGIs; the dots in the sky are LLMs; and Dr. S would be a guy saying LLMs will scale to AGI with no additional algorithmic insights.)
It would make AI Safety work less tractable
If LLMs will plateau (as I expect), I think there are nevertheless lots of tractable projects that would help AGI safety. Examples include:
The human brain runs some sort of algorithm to figure things out and get things done and invent technology etc. We don’t know exactly what that algorithm is (or else we would already have AGI), but we know much more than zero about it, and it’s obviously at least possible that AGI will be based on similar algorithms. (I actually believe something stronger, i.e. that it’s likely, but of course that’s hard to prove.) So now this is a pretty specific plausible AGI scenario that we can try to plan for. And that’s my own main research interest; see Intro to Brain-Like-AGI Safety. (Other varieties of model-based reinforcement learning would be pretty similar too.) Anyway, there’s tons of work to do on planning for that.
…For example, I list seven projects here. Some of those (e.g. this) seem robustly useful regardless of how AGI will work, i.e. even if future AGI is neither brain-like nor LLM-like, but rather some yet-unknown third category of mystery algorithm.
Outreach—After the invention of AGI (again, what you called digital “beings”), there are some obvious-to-me consequences, like “obviously human extinction is on the table as a possibility”, and “obviously permanent human oversight of AGIs in the long term would be extraordinarily difficult if not impossible” and “obviously AGI safety will be hard to assess in advance” and “if humans survive, those humans will obviously not be doing important science and founding important companies given competition from trillions of much-much-more-competent AGIs, just like moody 7-year-olds are not doing important science and founding important companies today” and “obviously there will be many severe coordination problems involved in the development and use of AGI technology”. But, as obvious as those things are to me, hoo boy there sure are tons of smart prominent people who would very confidently disagree with all of those. And that seems clearly bad. So trying to gradually establish good common knowledge of basic obvious things like that, through patient outreach and pedagogy, seems robustly useful and tractable to me.
Policy—I think there are at least a couple governance and policy interventions that are robustly useful regardless of whether AGI is based on LLMs (as others expect) or not (as I expect). For example, I think there’s room for building better institutions through which current and future tech companies (and governments around the world) can cooperate on safety as AGI approaches (whenever that may happen).
It seems that many people in Open Phil have substantially shortened their timelines recently (see Ajeya here).
For what it’s worth, Yann LeCun is very confidently against LLMs scaling to AGI, and yet LeCun seems to have at least vaguely similar timelines-to-AGI as Ajeya does in that link.
Ditto for me.
See also my discussion here (“30 years is a long time. A lot can happen. Thirty years ago, deep learning was an obscure backwater within AI, and meanwhile people would brag about how their fancy new home computer had a whopping 8 MB of RAM…”)
To be clear, you can definitely find some people in AI safety saying AGI is likely in <5 years, although Ajeya is not one of those people. This is a more extreme claim, and does seem pretty implausible unless LLMs will scale to AGI.
I think this makes me very concerned about a strong ideological and philosophical bubble in the Bay regarding these core questions of AI.
Yeah some examples would be:
many AI safety people seem happy to make confident guesses about what tasks the first AGIs will be better and worse at doing based on current LLM capabilities;
many AI safety people seem happy to make confident guesses about how much compute the first AGIs will require based on current LLM compute requirements;
many AI safety people seem happy to make confident guesses about which companies are likely to develop AGIs based on which companies are best at training LLMs today;
many AI safety people seem happy to make confident guesses about AGI UIs based on the particular LLM interface of “context window → output token”;
etc. etc.
Many ≠ All! But to the extent that these things happen, I’m against it, and I do complain about it regularly.
(To be clear, I’m not opposed to contingency-planning for the possibility that LLMs will scale to AGIs. I don’t expect that contingency to happen, but hey, what do I know, I’ve been wrong before, and so has Chollet. But I find that these kinds of claims above are often stated unconditionally. Or even if they’re stated conditionally, the conditionality is kinda forgotten in practice.)
By the way, this might be overly-cynical, but I think there are some people (coming into the AI safety field very recently) who understand how LLMs work but don’t know how (for example) model-based reinforcement learning works, and so they just assume that the way LLMs work is the only possible way for any AI algorithm to work.
Hey Steven! As always I really appreciate your engagement here. I’m going to have to simplify a lot, but I appreciate your links[1] and I’m definitely going to check them out 🙂
I think François is right, but I do think that work on AI safety is overwhelmingly valuable.
Here’s an allegory:
I think the most relevant disagreement that we have[2] is the beginning of your allegory. To indulge it, I don’t think we have knowledge of the intelligent alien species coming to Earth, and to the extent we have a conceptual basis for them we can’t see any signs of them in the sky. Pair this with the EA concern that we should be concerned about the counterfactual impact of our actions, and that there are opportunities to do good right here and now,[3] it shouldn’t be a primary EA concern.
Now, what would make it a primary concern is if Dr S is right and the aliens are spotted and they’re on their way, but I don’t think he’s right. And, to stretch the analogy to breaking point, I’d be very upset if, after I turned my telescope to the co-ordinates Dr S mentions and saw meteors instead of spaceships, significant parts of the EA movement still wanted more funding to construct the ultimate-anti-alien-space-laser or do alien-defence-research instead of buying bednets.
(When I say “AGI” I think I’m talking about the same thing that you called digital “beings” in this comment.)
A secondary crux I have is that a ‘digital being’ in the sense I describe, and possibly the AGI you think of, will likely exhibit certain autopoietic properties that make it significantly different from either the paperclip maximiser or a ‘foom-ing’ ASI. This is highly speculative though, based on a lot of philosophical intuitions, and I wouldn’t want to bet humanity’s future on it at all in the case where we did see aliens in the sky.
To be clear, you can definitely find some people in AI safety saying AGI is likely in <5 years, although Ajeya is not one of those people. This is a more extreme claim, and does seem pretty implausible unless LLMs will scale to AGI.
My take on it, though I admit driven by selection bias on Twitter, is that many people in the Bay-Social-Scene are buying into the <5 year timelines. Aschenbrenner for sure, Kokotajlo as well, and maybe even Amodei[4]? (Edit: Also lots of prominent AI Safety Twitter accounts seem to have bought fully into this worldview, such as the awful ‘AI Safety Memes’ account.) However, I do agree it’s not all of AI Safety for sure! I just don’t think that, once you take away that urgency and certainty of the problem, it ought to be considered the world’s “most pressing problem”, at least without further controversial philosophical assumptions.
[3] I’d argue through increasing human flourishing and reducing the suffering we inflict on animals, but you could sub in your own cause area here for instance, e.g. ‘preventing nuclear war’ if you thought that was both likely and an x-risk.
[4] See the transcript with Dwarkesh at 00:24:26 onwards where he says that superhuman/transformative AI capabilities will come within ‘a few years’ of the interview’s date (so within a few years of summer 2023).
Pair this with the EA concern that we should be concerned about the counterfactual impact of our actions, and that there are opportunities to do good right here and now,[3] it shouldn’t be a primary EA concern.
As in, your crux is that the probability of AGI within the next 50 years is less than 10%?
I think from an x-risk perspective it is quite hard to beat AI risk even on pretty long timelines. (Where the main question is bio risk and what you think about (likely temporary) civilizational collapse due to nuclear war.)
It’s pretty plausible that on longer timelines technical alignment/safety work looks weak relative to other stuff focused on making AI go better.
As in, your crux is that the probability of AGI within the next 50 years is less than 10%?
I’m essentially deeply uncertain about how to answer this question, in a true ‘Knightian Uncertainty’ sense and I don’t know how much it makes sense to use subjective probability calculus. It is also highly variable to what we mean by AGI though. I find many of the arguments I’ve seen to be a) deference to the subjective probabilities of others or b) extrapolation of straight lines on graphs—neither of which I find highly convincing. (I think your arguments seem stronger and more grounded fwiw)
I think from an x-risk perspective it is quite hard to beat AI risk even on pretty long timelines.
I think this can hold, but it holds not just in light of particular facts about AI progress now but in light of various strong philosophical beliefs about value, what future AI would be like, and how the future would be post the invention of said AI. You may have strong arguments for these, but I find many arguments for the overwhelming importance of AI Safety do very poorly at grounding these, especially in the light of compelling interventions to do good that exist in the world right now.
It is also highly variable to what we mean by AGI though.
I’m happy to do timelines to the singularity and operationalize this with “we have the technological capacity to pretty easily build projects as impressive as a Dyson sphere”.
(Or 1000x electricity production, or whatever.)
In my view, this likely adds only a moderate number of years (3-20 depending on how various details go).
For what it’s worth, Yann LeCun is very confidently against LLMs scaling to AGI, and yet LeCun seems to have at least vaguely similar timelines-to-AGI as Ajeya does in that link.
Ditto for me.
Oh hey here’s one more: Chollet himself (!!!) has vaguely similar timelines-to-AGI (source) as Ajeya does. (Actually if anything Chollet expects it a bit sooner: he says 2038-2048, Ajeya says median 2050.)
I agree with Chollet (and OP) that LLMs will probably plateau, but I’m also big into AGI safety—see e.g. my post AI doom from an LLM-plateau-ist perspective.
I think it’s also important to note that these habits above are regrettably common among both AI pessimists and AI optimists. As examples of the latter, see me replying to Matt Barnett and me replying to Quintin Pope & Nora Belrose.
[1] I remember reading and liking your ‘LLM plateau-ist’ piece.
[2] I can’t speak for all the others you mention, but fwiw I do agree with your frustrations at the AI risk discourse on various sides.