Superforecaster, former philosophy PhD, Giving What We Can member since 2012. Currently trying to get into AI governance.
David Mathers
Not everything being funded here even IS alignment techniques. But also, insofar as you just want a generally better understanding of AI as a domain through science, why wouldn't you learn useful stuff from applying techniques to current models? If the claim is that current models are too different from any possible AGI for this info to be useful, why do you think "do science" would help prepare for AGI at all? Assuming you do think that, which still seems unclear to me.
I asked about genuine research creativity, not AGI, but I don't think this conversation is going anywhere at this point. It seems obvious to me that "does stuff mathematicians say makes up the building blocks of real research" is meaningful evidence that the chance that models will do research-level maths in the near future is not ultra-low, given that capabilities do increase with time. I don't think this is analogous to IQ tests or the bar exam, and for other benchmarks, I would really need to see what you're claiming is the equivalent of the transfer from Frontier Math 4 to real math that was intuitive but failed.
The forum is a bit dead generally, for one thing.
I don't really get on what grounds you are saying that the Coefficient grants are not to people to do science, apart from the governance ones. I also think you are switching back and forth between "no one knows when AGI will arrive; the best way to prepare just in case is more normal AI science" and "we know that AGI is far off, so there's no point doing normal science to prepare against AGI now, although there might be other reasons to do normal science."
I guess I still just want to ask: if models hit 80% on Frontier Math by, say, June 2027, how much does that change your opinion on whether models will be capable of "genuine creativity" in at least one domain by 2033? I'm not asking for an exact figure, just a ballpark guess. If the answer is "hardly at all", is there anything short of a 100% clear example of a novel publishable research insight in some domain that would change your opinion on when "real creativity" will arrive?
I think what you are saying here is mostly reasonable, even if I am not sure how much I agree: it seems to turn on very complicated issues in the philosophy of probability/decision theory, what you should do when accurate prediction is hard, and exactly how bad predictions have to be to be valueless. Having said that, I don't think you're going to succeed in steering conversation away from forecasts if you keep writing about how unlikely it is that AGI will arrive near term. Which you have done a lot, right?
I'm genuinely not sure how much EA funding for AI-related stuff even is wasted on your view. To a first approximation, EA is what Moskovitz and Tuna fund. When I look at Coefficient's (i.e. what was previously Open Phil's) 7 most recent AI safety and governance grants, here's what I find:
1) A joint project of METR and RAND to develop new ways of assessing AI systems for risky capabilities.
2) "AI safety workshop field building" by BlueDot Impact
3) An AI governance workshop at ICML
4) "General support" for the Center for Governance of AI.
5) A "study on encoded reasoning in LLMs at the University of Maryland"
6) "Research on misalignment" here: https://www.meridiancambridge.org/labs
7) "Secure Enclaves for LLM Evaluation" here: https://openmined.org/
So is this stuff bad or good on the worldview you've just described? I have no idea, basically. None of it is forecasting, and plausibly it all broadly falls under either empirical research on current and very near future models, training new researchers, or governance stuff, though that depends on what "research on misalignment" means. But of course, you'd only endorse it if it is good research. If you are worried about lack of academic credibility specifically, as far as I can tell 7 out of the 20 most recent grants are to academic research in universities. It does seem pretty obvious to me that significant ML research goes on at places other than universities, though, not least the frontier labs themselves.
I guess I feel like if being able to solve mathematical problems, designed by research mathematicians to be similar to the kind of problems they solve in their actual work, is not decent evidence that AIs are on track to be able to do original research in mathematics within, say, 8 years, then what would you EVER accept as empirical evidence that we are on track for that but not there yet?
Note that I am not saying this should push your overall confidence to over 50% or anything, just that it ought to move you up by a non-trivial amount relative to whatever your credence was before. I am certainly NOT saying that skill on Frontier Math 4 will inevitably transfer to real research mathematics, just that you should think there is a substantial risk that it will.
I am not persuaded by the analogy to IQ test scores, for the following reason. It is far from clear that IQ test items resemble the tasks LLMs can't do (despite scoring 100) anything like as closely as the Frontier Math 4 tasks are, at least allegedly, designed to resemble real research questions in mathematics*, because the latter are deliberately designed for similarity, whereas IQ tests are just designed so that skill on them correlates with skill on intellectual tasks in general among humans. (I also think the inference towards "they will be able to DO research math" from progress on Frontier Math 4 is rather less shaky than "they will DO proper research math in the same way as humans". It's not clear to me what tasks actually require "real creativity", if that means a particular reasoning style rather than just the production of novel insights as an end product. I don't think you or anyone else knows this either.) Real math is also uniquely suited to question-and-answer benchmarks, I think, because things really are often posed as extremely well-defined problems with determinate answers, i.e. prove X. Proving things is not literally the only skill mathematicians have, but being able to prove the right stuff is enough to be making a real contribution. In my view that makes claims for construct validity here much more plausible than, say, inferring ChatGPT can be a lawyer if it passes the bar exam.
In general, your argument here seems like it could be deployed against literally any empirical evidence that AIs were approaching being able to do a task, short of them actually performing that task. You can always say "just because, in humans, ability to do X is correlated with ability to do Y, doesn't mean the techniques the models are using to do X can do Y with a bit of improvement." And yes, it is always true that it doesn't *automatically* mean that. But if you allow this to mean that no success on any task ever significantly moves you at all about future real-world progress on intuitively similar but harder tasks, you are basically saying it is impossible to get empirical evidence that progress is coming before it has arrived, which is just pretty suspicious a priori. What you should do, in my view, is think carefully about the construct validity of the particular benchmark in question, and then, roughly, update your view based on how likely you think it is to be basically valid, and what it would mean if it was. You should take into account the risk that success on Frontier Math 4 is giving real signal, not just the risk that it is meaningless.
My personal guess is that it is somewhat meaningful, and that we will see the first real AI contributions to maths in 6-7 years, i.e. a 60% chance by then of AI proofs important enough for credible mid-ranking journals. To be clear, I say "somewhat" because this is several years after I expect the benchmark itself to saturate. EDIT: I forgot my own forecast here; I expect saturation in about 5 years, so "several" years is an exaggeration. Nonetheless I expect some gap between Frontier Math 4 being saturated and the first real contributions to research mathematics: 6-9 years until real contributions is more like my forecast than 6-7. But I am not shocked if someone thinks "no, it is more likely to be meaningless". I do think, though, that if you're going to make a strong version of the "it's meaningless" case, where you don't see the results as signal to any non-negligible degree, you need more than to just say "some other benchmarks in far less formal domains, apparently far less similar to the real-world tasks being measured, have low construct validity."
In your view, is it possible to design a benchmark that a) does not literally amount to "produce a novel important proof", but b) is such that improvements on the benchmark nonetheless give decent evidence that we are moving towards models being able to do this? If it is possible, how would it differ from Frontier Math 4?
*I am prepared to change my mind on this if a bunch of mathematicians say "no, actually the questions don't look like they were optimized for this."
"Rob Wiblin opines that the fertility crash would be a global priority if not for AI likely replacing human labor soon and obviating the need for countries to have large human populations"
This is a case where it really matters whether you are giving an extremely high chance that AGI is coming within 20-30 years, or merely a decently high chance. If you think the chance is, say, 75%, and the claim that low fertility would be a big problem conditional on no AGI is correct, then the problem's expected importance is only cut by 4x, which is compatible with it still being large and worth working on. Really, you need to get above 97-98% before it starts looking clear that low fertility is not worth worrying about, if we assume that conditional on no AGI it will be a big problem.
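To spell out the arithmetic behind those numbers, here is a minimal sketch, assuming for simplicity that the problem has roughly zero importance in worlds where AGI does arrive; the symbol I_no-AGI is my own shorthand, not anything from the discussion above:

```latex
% Expected importance of the fertility problem, on the simplifying assumption
% that it only matters in worlds where AGI does not arrive.
\[
  \mathbb{E}[\text{importance}] \;\approx\; P(\text{no AGI}) \cdot I_{\text{no AGI}}
\]
\[
  P(\text{AGI}) = 0.75 \;\Rightarrow\; 0.25\, I_{\text{no AGI}} \quad (\text{a } 4\times \text{ discount}),
  \qquad
  P(\text{AGI}) = 0.975 \;\Rightarrow\; 0.025\, I_{\text{no AGI}} \quad (\text{a } 40\times \text{ discount}).
\]
```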
I'm not actually that interested in defending:
The personal honor of Yudkowsky, whom I've barely read and don't much like, or his influence on other people's intellectual style. I am not a rationalist, though I've met some impressive people who probably are.
The specific judgment calls and arguments made in AI 2027.
Using the METR graph to forecast superhuman coders (even if I probably do think this is MORE reasonable than you do, I'm not super-confident about its validity as a measure of real-world coding. And I was not trying to describe how I personally would forecast superhuman coders, just to give a hypothetical case where making a forecast more "subjective" plausibly improves it.)
Rather what I took myself to be saying was:
Judgmental forecasting is not particularly a LW thing, and it is what AI2027 was doing, whether or not they were doing it well.
You can't really avoid what you are calling "subjectivity" when doing judgmental forecasting, at least if that means not just projecting a trend in data and having done with it, but instead letting qualitative considerations affect the final number you give.
Sometimes it would clearly make a forecast better to make it more "subjective", if that just means less driven only by a projection of a trend in data into the future.
In predicting a low chance of AGI in the near term, you are also just making an informed guess, influenced by data but also by qualitative considerations, argument, gut instinct, etc. At that level of description, your forecast is just as "made up" as AI 2027's. (But of course this is completely compatible with the claim that some of AI 2027's specific guesses are not well-justified enough, or implausible.)
Now, it may be that forecasting is useless here, because no one can predict how technology will develop five years out. But I'm pretty comfortable saying that if THAT is your view, then you really shouldn't also be super-confident that the chance of near-term AGI is low. Though I do think saying "this just can't be forecast reliably" on its own is consistent with criticizing people who are confident AGI is near.
My thought process didn't go beyond "Yarrow seems committed to a very low chance of AI having real, creative research insights in the next few years; here is something that puts some pressure on that". Obviously I agree that when AGI will arrive is a different question from when models will have real insights in research mathematics. Nonetheless I got the feeling (maybe incorrectly) that your strength of conviction that AGI is far off is partly based on things like "models in the current paradigm can't have real insight", so it seemed relevant, even though "real insight in maths is probably coming soon, but AGI is likely over 20 years away" is perfectly coherent, and indeed close to my own view.
Anyway, why can't you just answer my question?
Working on AI isn't the same as doing EA work on AI to reduce X-risk. Most people working in AI are just trying to make the AI more capable and reliable. There probably is a case for saying that "more reliable" is actually EA X-risk work in disguise, even if unintentionally, but it's definitely not obvious this is true.
"Any sort of significant credible evidence of a major increase in AI capabilities, such as LLMs being able to autonomously and independently come up with new correct ideas in science, technology, engineering, medicine, philosophy, economics, psychology, etc"
Just in the spirit of pinning people to concrete claims: would you count progress on Frontier Math 4, like, say, models hitting 40%*, as evidence that this is not so far off for mathematics specifically? (To be clear, I think it is very easy to imagine models that are doing genuinely significant research maths but still can't reliably be a personal assistant, so I am not saying this is strong evidence of near-term AGI or anything like that.) Frontier Math Tier 4 questions allegedly require some degree of "real" mathematical creativity and were designed by actual research mathematicians, including in some cases Terry Tao (EDIT: that is, he supplied some Frontier Math questions; I'm not sure if any were Tier 4), so we're not talking cranks here. Epoch claim some of the problems can take experts weeks. If you wouldn't count this as evidence that genuine AI contributions to research mathematics might not be more than 6-7 years off, what, if anything, would you count as evidence of that? If you don't like Frontier Math Tier 4 as an early warning sign, is that because:
1) You think it's not really true that the problems require real creativity, and you don't think "uncreative" ways of solving them will ever get you to being able to do actual research mathematics that could get into good journals.
2) You just don't trust models not to be trained on the test set, because there was a scandal about OpenAI having access to the answers. (Though, as I've said, the current state of the art is a Google model.)
3) 40% is too low; something like 90% would be needed for a real early warning sign.
4) In principle, this would be a good early warning sign if, for all we knew, RL scaling could continue for many more orders of magnitude; but since we know it can't continue for more than a few, it isn't, because by the time you're hitting a high level on Frontier Math 4 you're hitting the limits of RL scaling and can't improve further.
Of course, maybe you think the metric is fine, but you just expect progress to stall well before scores are high enough to be an early warning sign of real mathematical creativity, because of limits to RL-scaling?
*Current best is some version of Gemini at 18%.
Yeah, it's a fair objection that even answering the "why" question the way I did presupposes that EAs are wrong, or at least merely luckily right. (I think this is a matter of degree, and that EAs overrated the imminence of AGI and the risk of takeover on average, but it's still at least reasonable to believe AI safety and governance work can have very high expected value for roughly the reasons EAs do.) But I was responding to Yarrow, who does think that EAs are just totally wrong, so I guess really I was saying "conditional on a sociological explanation being appropriate, I don't think it's as LW-driven as Yarrow thinks", although LW is undoubtedly important.
Can you say more about what makes something "a subjective guess" for you? When you say there is a well under 0.05% chance of AGI in 10 years, is that a subjective guess?
Like, suppose I am asked, as a pro forecaster, to say whether the US will invade Syria after a US military build-up involving aircraft carriers in the Eastern Med, and I look for newspaper reports of signs of this, look up the base rate of how often the US bluffs with a military build-up rather than invading, and then make a guess as to how likely an invasion is. Is that "a subjective guess"? Or am I relying on data? What about if I am doing what AI 2027 did and trying to predict when LLMs match human coding ability on the basis of current data? Suppose I use the METR data like they did, and I do the following. I assume that if AIs are genuinely able to complete 90% of real-world tasks that take human coders 6 months, then they are likely as good at coding as humans. I project the METR data out to find a date for when we will hit 6-month tasks, theoretically, if the trend continues. But then, instead of stopping and saying that is my forecast, I remember that benchmark performance is generally a bit misleading in terms of real-world competence, and remember that METR found AIs often couldn't complete more realistic versions of the tasks which the benchmark counted them as passing. (Couldn't find a source for this claim, but I remember seeing it somewhere.) I decide that the date when models will hit a 90% completion rate on real-world 6-month tasks should maybe be a couple more doubling times of the 90% time-horizon METR metric forward. I move my forecast for human-level coders to, say, 15 months after the original to reflect this. Am I making a subjective guess, or relying on data? When I made the adjustment to reflect issues about construct validity, did that make my forecast more subjective? If so, did it make it worse, or did it make it better? I would say better, and I think you'd probably agree, even if you still think the forecast is bad. (A rough numerical sketch of this kind of adjustment is below.)
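A minimal sketch of the adjustment described above, in Python. The starting time-horizon, the doubling time, the target horizon, and the two-doubling penalty are all hypothetical placeholders of my own, not real METR figures or the actual AI 2027 method:

```python
from datetime import date, timedelta

# Hypothetical inputs -- placeholders, not real METR numbers.
current_horizon_hours = 2.0          # assumed task length models handle reliably today
doubling_time_days = 210             # assumed doubling time of the time-horizon metric
target_horizon_hours = 6 * 30 * 8    # "6-month" tasks, treated as ~6 months of 8-hour days
today = date(2026, 1, 1)             # arbitrary reference date

# How many doublings does the trend need to reach the target horizon?
doublings_needed = 0
horizon = current_horizon_hours
while horizon < target_horizon_hours:
    horizon *= 2
    doublings_needed += 1

# Pure trend projection: extend the doubling trend until it hits the target.
naive_forecast = today + timedelta(days=doublings_needed * doubling_time_days)

# Construct-validity adjustment: assume benchmark scores overstate real-world
# competence, so push the forecast out by a couple of extra doubling times.
extra_doublings = 2
adjusted_forecast = naive_forecast + timedelta(days=extra_doublings * doubling_time_days)

print("Naive trend-projection forecast:", naive_forecast)
print("Adjusted ('more subjective') forecast:", adjusted_forecast)
```

The point of the sketch is only that the second number is still "driven by data"; the qualitative judgment enters in choosing how many extra doublings to add.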
This geopolitical example is not particularly hypothetical. I genuinely get paid to do this for Good Judgment, and not ONLY by EA orgs, although often it is by them. We don't know who the clients are, but some questions have been clearly commercial in nature and of zero EA interest. I'm not particularly offended* if you think this kind of "get allegedly expert forecasters, rather than or as well as domain experts, to predict stuff" is nonsense. I do it because people pay me and it's great fun, rather than because I have seriously investigated its value. But I do disagree with the idea that this is distinctively a Less Wrong rationalist thing. There's a whole history of relatively well-known work on it by the American political scientist Philip Tetlock that I think began when Yudkowsky was literally still a child. It's out of that work that Good Judgment, the org for which I work as a forecaster, comes, not anything to do with Less Wrong. It's true that LessWrong rationalists are often enthusiastic about it, but that's not all that interesting on its own. (In general, many Yudkowskian ideas actually seem derived from quite mainstream sources on rationality and decision-making to me. I would not reject them just because you don't like what LW does with them. Bayesian epistemology is a real research program in philosophy, for example.)
*Or at least, I am trying my best not to be offended, because I shouldn't be, but of course I am human, and objectivity about something I derive status and employment from is hard. Though I did have a cool conversation at the last EAG London with a very good forecaster who thought it was terrible that Open Phil put money into forecasting, because it just wasn't very useful or important.
I don't think EA's AI focus is a product only of interaction with Less Wrong (not claiming you said otherwise), but I do think people outside the Less Wrong bubble tend to be less confident AGI is imminent, and in that sense less "cautious".
I think EA's AI focus is largely a product of the fact that Nick Bostrom knew Will and Toby when they were founding EA, and was a big influence on their ideas. Of course, to some degree this might be indirect influence from Yudkowsky, since he was always interacting with Nick Bostrom, but it's hard to know in which direction the influence flowed here. I was around in Oxford during the embryonic stages of EA, and while I was not involved (beyond being a GWWC member), I did have the odd conversation with people who were involved, and my memory is that even then people were talking about X-risk from AI as a serious contender for the best cause area, as early as at least 2014, and maybe a bit before that. They (EDIT: by "they" here I mean "some people in Oxford, I don't remember who"; I don't know when Will and Toby specifically first interacted with LW folk) were involved in discussion with LW people, but I don't think they got the idea FROM LW. It seems more likely to me that they got it from Bostrom and the Future of Humanity Institute, who were just down the corridor.
What is true is that Oxford people have genuinely expressed much more caution about timelines. I.e. in What We Owe the Future, published as late as 2022, Will is still talking about how AGI might be more than 50 years away, but also how "it might come soon-within the next fifty or even twenty years." (If you're wondering what evidence he cites, it's the Cotra bioanchors report.) His discussion primarily emphasizes uncertainty about exactly when AGI will arrive, and how we can't be confident it's not close. He cites a figure from an Open Phil report guessing an 8% chance of AGI by 2036*. I know your view is that this is all wildly wrong still, but it's quite different from what many (not all) Less Wrong people say, who tend to regard 20 years as a long timeline. (Maybe Will has updated to shorter timelines since, of course.)
I think there is something of a divide between people who believe strongly in a particular set of LessWrong-derived ideas about the imminence of AGI, and another set of people who are mainly driven by something like "we should take positive EV bets with a small chance of paying off, and do AI stuff just in case AGI arrives soon". Defending the point about taking positive EV bets with only a small chance of pay-off is what a huge amount of the academic work on Longtermism at the GPI in Oxford was about. (This stuff definitely has been subjected to severe levels of peer-reviewed scrutiny, as it keeps showing up in top philosophy journals with rejection rates of like 90%.)
*This is more evidence people were prepared to bet big on AI risk long before the idea that AGI is actually imminent became as popular as it is now. I think people just rejected the idea that useful work could only be done when AGI was definitely near, and we had near-AGI models.
People vary a lot in how they interpret terms like "unlikely" or "very unlikely" in % terms, so I think >10% is not all that obvious. But I agree that it is evidence they don't think the whole idea is totally stupid, and that a relatively low probability of near-term AGI is still extremely worth worrying about.
I don't think it's clear, absent further argument, that there has to be a 10% chance of full AGI in the relatively near future to justify the currently high valuations of tech stocks. New, more powerful models could be super-valuable without being able to do all human labour. (For example, if they weren't so useful working alone, but they made human workers in most white-collar occupations much more productive.) And you haven't actually provided evidence that most experts think there's a 10% chance the current paradigm will lead to AGI. Though the latter point is a bit of a nitpick if 24% of experts think it will, since I agree the latter is likely enough to justify EA money/concern. (Maybe the survey had some "don't knows", though?)
"i don't believe very small animals feel pain, and if they do my best guess would be it would be thousands to millions orders of magnitude less pain than larger animals."
I'll repeat what regular readers of the forum are bored of me saying about this. As a philosophy of consciousness PhD, I barely ever heard the idea that small animals are conscious but their experiences are way less intense. At most, it might be a consequence of integrated information theory, but not one I ever saw discussed, and most people in the field don't endorse that theory anyway. I cannot think of any other theory which implies this, or any philosophy of mind reason to think it is so. It seems suspiciously like it is just something EAs say to avoid commitments to prioritizing tiny animals that seem a bit mad. Even if we take seriously the feeling that those commitments are a bit mad, there are any number of reasons that could be true apart from "small conscious brains have proportionally less intense experiences than large conscious brains." The whole idea also smacks to me of the picture on which pain is literally a substance, like water or sand, that the brain somehow "makes" using neurons as an ingredient, in the way that combining two chemicals might make a third via a reaction, where how much of the product you get out depends on how much you put in. On mind-body dualist views this picture might make some kind of surface sense, though it'll get a bit complicated once you start thinking about the possibility of conscious aliens without neurons. But on more popular physicalist views of consciousness, this picture is just wrong: conscious pain is not stuff that the brain makes.
Nor does it particularly seem "commonsense" to me. A dog has a somewhat smaller brain than a human, but I don't think most people think that their dog CAN feel pain, but feels somewhat less pain than it appears to, just because its brain is a bit smaller than a person's. Of course, it could be that intensity is the same once you hit a certain brain size, no matter how much you then scale up, but starts to drop off proportionately below a certain level of smallness; but that seems pretty ad hoc.
I think when people say it is rapidly decreasing they may often mean that the % of the world's population living in extreme poverty is declining over time, rather than that the total number of people living in extreme poverty is going down?
What have EA funders done that's upset you?