Lukas Finnveden
Research analyst at Open Philanthropy. All opinions are my own.
It’s the crux between you and Ajeya, because you’re relatively more in agreement on the other numbers. But I think that adopting the XPT numbers on these other variables would notably slow down your own timelines, because of the almost complete lack of increase in spending.
That said, if the forecasters agreed with your compute requirements, they would probably also forecast higher spending.
in terms of saving “disability-adjusted life years” or DALYs, “a case of HIV/AIDS can be prevented for $11, and a DALY gained for $1” by improving the safety of blood transfusions and distributing condoms
These numbers are wild compared to e.g. current GiveWell numbers. My guess would be that they’re wrong, and if so, that this was a big part of why PEPFAR did comparatively better than expected. Or maybe that they were significantly less scalable (measured in cost of marginal life saved as a function of lives saved so far) than PEPFAR.
If the numbers were right, and you could save more lives than PEPFAR for 100x less money (or 30x (?) less after taking into account some falls in cost), I’m not sure I buy that the political feasibility of PEPFAR was greater than the much cheaper ask (a priori). At least I get very sympathetic to the then-economists.
(But again, I’d guess those numbers were probably wrong or unscalable?)
Nice, gotcha.
Incidentally, as its central estimate for algorithmic improvement, the takeoff speeds model uses AI and Efficiency’s ~1.7x per year, and then halves it to ~1.3x per year (because today’s algorithmic progress might not generalize to TAI). If you’re at 2x per year, then you should maybe increase the “returns to software” from 1.25 to ~3.5, which would cut the model’s timelines by something like 3 years. (More on longer timelines, less on shorter timelines.)
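To spell out the “halving” arithmetic: halving a rate of progress happens in log space, i.e. it corresponds to taking the square root of the yearly multiplier. A minimal sketch (my own illustration, not code from the model):

```python
import math

# Halving a yearly efficiency multiplier means halving it in log space,
# i.e. taking the square root of the multiplier:
# half of 1.7x/year progress is 1.7^(1/2) ≈ 1.30x/year.
def halve_progress_rate(yearly_multiplier: float) -> float:
    return math.sqrt(yearly_multiplier)

print(halve_progress_rate(1.7))  # ≈ 1.30, the model's central estimate
print(halve_progress_rate(2.0))  # ≈ 1.41, if you start from 2x/year
```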
Yeah sorry, I didn’t mean to say this directly contradicted anything you said. It just felt like a good reference that might be helpful to you or other people reading the thread. (In retrospect, I should have said that and/or linked it in response to the mention in your top-level comment instead.)
(Also, personally, I do care about how much effort and selection is required to find good retrodictions like this, so in my book “I didn’t look up the data on Google beforehand” is relevant info. But it would have been way more impressive if someone had been able to pull that off in 1890, and I agree this shouldn’t be confused for that.)
Re “it was incorrect by an order of magnitude”: that seems fine to me. If we could get that sort of precision for predicting TAI, that would be awesome and outperform any other prediction method I know about.
and notably there’s been perhaps a 2x speedup in algorithmic progress since 2022
I don’t understand this. Why would there be a 2x speedup in algorithmic progress?
And, as I think Eliezer said (roughly), there don’t seem to be many cases where new tech was predicted based on when some low-level metric would exceed the analogous metric in a biological system. [...] And the way in which machines perform tasks usually looks very different than how biological systems do it (bird vs. airplanes, etc.).
From Birds, Brains, Planes, and AI:
This data shows that Shorty [hypothetical character introduced earlier in the post] was entirely correct about forecasting heavier-than-air flight. (For details about the data, see appendix.) Whether Shorty will also be correct about forecasting TAI remains to be seen.
In some sense, Shorty has already made two successful predictions: I started writing this argument before having any of this data; I just had an intuition that power-to-weight is the key variable for flight and that therefore we probably got flying machines shortly after having comparable power-to-weight as bird muscle. Halfway through the first draft, I googled and confirmed that yes, the Wright Flyer’s motor was close to bird muscle in power-to-weight. Then, while writing the second draft, I hired an RA, Amogh Nanjajjar, to collect more data and build this graph. As expected, there was a trend of power-to-weight improving over time, with flight happening right around the time bird-muscle parity was reached.
I think my biggest disagreement with the takeoff speeds model is just that it’s conditional on things like: no coordinated delays, regulation, or exogenous events like war, and doesn’t take into account model uncertainty.
Cool, I thought that was most of the explanation for the difference in the median. But I thought it shouldn’t be enough to explain the 14x difference between 28% and 2% by 2030, because I think there should be a ≥20% chance that there are no significant coordinated delays, regulation, or relevant exogenous events if AI goes wild in the next 7 years. (And that model uncertainty should work to increase rather than decrease the probability, here.)
If you think robotics would definitely be necessary, then I can see how that would be significant.
But I think it’s possible that we get a software-only singularity. Or more broadly, simultaneously having (i) AI improving algorithms (...improving AIs), (ii) a large fraction of the world’s fab-capacity redirected to AI chips, and (iii) AIs helping with late-stage hardware stuff like chip-design. (I agree that it takes a long time to build new fabs.) This would simultaneously explain why robotics aren’t necessary (before we have crazy good AI) and decrease the probability of regulatory delays, since the AIs would just need to be deployed inside a few companies. (I can see how regulation would by-default slow down some kinds of broad deployment, but it seems super unclear whether there will be regulation put in place to slow down R&D and internal deployment.)
My own distribution over the training FLOP for transformative AI is centered around ~10^32 FLOP using 2023 algorithms, with a standard deviation of about 3 OOM.
Thanks for the numbers!
For comparison, takeoffspeeds.com has an aggressive Monte Carlo run (with a median of 10^31 training FLOP) that yields a median of 2033.7 for 100% automation — and a p(TAI < 2030) of ~28%. That 28% is pretty radically different from your 2%. Do you know your biggest disagreements with that model?
The 1 OOM difference in training FLOP presumably doesn’t explain that much. (Although maybe it’s more, because takeoffspeeds.com talks about “AGI” and you talk about “TAI”. On the other hand, maybe your bar for “transformative” is lower than 100% automation.)
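For a rough sense of why 1 OOM shouldn’t matter much here: with a 3-OOM standard deviation, shifting the median by 1 OOM only moves a modest amount of probability mass. A toy calculation (my own assumption of normality in log10-FLOP, not Matthew’s actual model):

```python
from statistics import NormalDist

# Toy assumption: log10(training FLOP needed for TAI) ~ Normal(32, 3),
# matching the stated median of 10^32 FLOP and sd of 3 OOM.
tai_flop = NormalDist(mu=32, sigma=3)

# Probability that the requirement is at most 10^31 FLOP:
p_under_31 = tai_flop.cdf(31)
print(f"{p_under_31:.2f}")  # ≈ 0.37, vs 0.50 if the median were 10^31
```

So under this toy picture, the 1 OOM difference in medians changes the relevant tail probability by well under 2x, far short of a 14x gap.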
Some related responses to stuff in your post:
The most likely cause of such a sudden acceleration seems to be that pre-superintelligent systems could accelerate technological progress. But, as I have just argued above, a rapid general acceleration of technological progress from pre-superintelligent AI seems very unlikely in the next few years.
You argued that AI labor would be small in comparison to all of human labor, if we got really good software in the next 4 years. But if we had recently gotten such insane gains in ML-capabilities, people would want to vastly increase investment in ML-research (and hardware production) relative to everything else in the world. Normally, labor spent on ML research would lag behind, because it takes a long time to teach a large number of humans the requisite skills. But for each skill, you’d only need to figure out how to teach AI about it once, and then all 10 million AIs would be able to do it. (There would certainly be some lag, here, too. Your post says “lag for AI will likely be more than a year”, which I’m sympathetic to, but there’s time for that.)
When I google “total number of ml researchers”, the largest number I see is 300k and I think the real answer is <100k. So I don’t think a huge acceleration in AI-relevant technological progress before 2030 is out of the question.
(I think it’s plausible we should actually be thinking about the best ML researchers rather than just counting up the total number. But I don’t think it’d be crazy for AIs to meet that bar in the hypothetical you paint. Given the parallelizability of AI, it’s both the case that (i) it’s worth spending much more effort on teaching skills to AIs, and (ii) that it’s possible for AIs to spend much more effective time on learning.)
I am also inclined to cut some probability away from short timelines given the lack of impressive progress in general-purpose robotics so far, which seems like an important consideration given that the majority of labor in the world currently requires a physical component.
Mostly not ML research.
Also, if the AIs are bottlenecked by motor skills, humans can do that part. When automating small parts of the total economy (like ML research or hardware production), there’s room to get more humans into those industries to do all the necessary physical tasks. (And at the point when AI cognitive output is large compared to the entire human workforce, you can get a big boost in total world output by having humans switch into just doing manual labor, directed by AIs.)
However, my unconditional view is somewhat different. After considering all potential delays (including regulation, which I think is likely to be substantial) and model uncertainty, my overall median TAI timeline is somewhere between 20-30 years from now, with a long tail extending many decades into the future.
I can see how stuff like regulation would feature in many worlds, but it seems high variance and like it should allow for a significant probability of ~no delay.
Also, my intuition is that 2% is small enough in the relevant context that model uncertainty should push it up rather than down.
The quote continues:
Of the remaining 5%, around 70% would eventually be reached by other civilisations, while 30% would have remained empty in our absence.
I think the 70%/30% numbers are the relevant ones for comparing human colonization vs. extinction vs. misaligned AGI colonization. (Since 5% cuts the importance of everything equally.)
...assuming defensive dominance in space, where you get to keep space that you acquire first. I don’t know what happens without that.
This would suggest that if we’re indifferent between space being totally uncolonized and being colonized by a certain misaligned AGI and if we’re indifferent between aliens and humans colonizing space: then preventing that AGI is ~3x as good as preventing extinction.
If we value aliens less than humans, it’s less. If we value the AGI positively, it’s also less. If we value the AGI negatively, it’d be more.
If AGI systems had goals that were cleanly separated from the rest of their cognition, such that they could learn and self-improve without risking any value drift (as long as the values-file wasn’t modified), then there’s a straightforward argument that you could stabilise and preserve that system’s goals by just storing the values-file with enough redundancy and digital error correction.
So this would make section 6 mostly irrelevant. But I think most other sections remain relevant, insofar as people weren’t already convinced that being able to build stable AGI systems would enable world-wide lock-in.
Therefore, it seems to me that most of your doc assumes we’re in this scenario [without clean separation between values and other parts]?
I was mostly imagining this scenario as I was writing, so when relevant, examples/terminology/arguments will be tailored for that, yeah.
I really like the proposed calibration game! One thing I’m curious about is whether real-world evidence more often looks like a likelihood ratio or like something else (e.g. pointing towards a specific probability being correct). Maybe you could see this from the structure of priors + likelihood ratios + posteriors in the calibration game — e.g. check whether the long-run top-scorers’ likelihood ratios correlated more or less than their posterior probabilities.
(If someone wanted to build this: one option would be to start with pastcasting and then give archived articles or wikipedia pages as evidence. Maybe a sophisticated version could let you start out with an old relevant wikipedia page, and then see a wikipedia page much closer to the resolution date as extra evidence.)
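For concreteness, the kind of odds-form update the game would be eliciting might look like this (a hypothetical helper of my own, not something from the proposal):

```python
def update(prior_prob: float, likelihood_ratio: float) -> float:
    """Bayesian update in odds form: posterior odds = prior odds * LR."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# A forecaster at 20% who judges a piece of evidence to be 4:1 in favor:
print(update(0.2, 4.0))  # 0.5
```

The question in the comment is then whether forecasters’ stated likelihood ratios or their final posteriors track reality better.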
And it would probably be a huge mistake to seek out an adderall prescription.
...unless you have other reasons to believe that an Adderall prescription might be good for you. Saliently: if you have ADHD symptoms.
Depends on how much of their data they’d have to back up like this. If every bit ever produced or operated on instead had to be 25 bits — that seems like a big fitness hit. But if they’re only this paranoid about a few crucial files (e.g. the minds of a few decision-makers), then that’s cheap.
And there’s another question about how much stability contributes to fitness. In humans, cancer tends to not be great for fitness. Analogously, it’s possible that most random errors in future civilizations would look less like slowly corrupting values and more like a coordinated whole splintering into squabbling factions that can easily be conquered by a unified enemy. If so, you might think that an institution that cared about stopping value-drift and an institution that didn’t would both have a similarly large interest in preventing random errors.
Also, by the same token, even if there is a “singleton” at some relatively early time, mightn’t it prefer to take on a non-negligible risk of value drift later in time if it means being able to, say, 10x its effective storage capacity in the meantime?
The counter-argument is that it will be super rich regardless, so it seems like satiable value systems would be happy to spend a lot on preventing really bad events from happening with small probability. Whereas insatiable value systems would notice that most resources are in the cosmos, and so also be obsessed with avoiding unwanted value drift. But yeah, if the values contain a pure time preference, and/or don’t care that much about the most probable types of value drift, then it’s possible that they wouldn’t deem the investment worth it.
This is a great question. I think the answer depends on the type of storage you’re doing.
If you have a totally static lump of data that you want to encode in a harddrive and not touch for a billion years, I think the challenge is mostly in designing a type of storage unit that won’t age. Digital error correction won’t help if your whole magnetism-based harddrive loses its magnetism. I’m not sure how hard this is.
But I think more realistically, you want to use a type of hardware that you regularly use, regularly service, and where you can copy the information to a new harddrive when one is about to fail. So I’ll answer the question in that context.
As an error rate, let’s use the failure rate of 3.7e-9 per byte per month ~= 1.5e-11 per bit per day from this stack overflow reply. (It’s for RAM, which I think is more volatile than e.g. SSD storage, and certainly not optimised for stability, so you could probably get that down a lot.)
Let’s use the following as an error correction method: Each bit is represented by N bits; for any computation the computer does, it will use the majority vote of the N bits; and once per day,[1] each bit is reset to the majority vote of its group of bits.
If so...
- For N=1, the probability that a bit is stable for 1e9 years is ~exp(-1.5e-11*365*1e9) ≈ 0.4%. Yikes!
- For N=3, the probability that 2 bit flips happen in a single day is ~3*(1.5e-11)^2, and so the probability that a group of bits is stable for 1e9 years is ~exp(-3*(1.5e-11)^2*365*1e9) = 1-2e-10. Much better, but there will probably still be a million errors in that petabyte of data.
- For N=5, the probability that 3 bit flips happen in a single day is ~(5 choose 3)*(1.5e-11)^3, and so the probability that the whole petabyte of data is safe for 1e9 years is ~99.99%. So under this scheme, 5 petabytes of storage is enough to make 1 petabyte stable for a billion years.
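The arithmetic above can be reproduced directly. A sketch using the same numbers and the same approximation (a group is lost if a majority of its N bits flips within one reset interval):

```python
from math import comb, exp

P_FLIP = 1.5e-11   # bit-flip probability per bit per day (RAM estimate above)
DAYS = 365 * 1e9   # one billion years, in days
DATA_BITS = 8e15   # one petabyte of data bits

def p_group_fails_per_day(n: int) -> float:
    # Approximation: a group of n bits is lost if a majority (n//2 + 1)
    # flips within a single day, before the daily majority-vote reset.
    k = n // 2 + 1
    return comb(n, k) * P_FLIP ** k

# N=1: probability a single unprotected bit survives a billion years
print(exp(-p_group_fails_per_day(1) * DAYS))              # ≈ 0.004

# N=5: probability the whole petabyte survives a billion years
print(exp(-p_group_fails_per_day(5) * DAYS * DATA_BITS))  # ≈ 0.9999
```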
Based on the discussion here, I think the errors in doing the majority-voting calculations are negligible compared to the cosmic ray calculations. At least if you do it cleverly, so that you don’t get too many correlations and ruin your redundancy (there are ways to do this, according to results on error-correcting computation — though I’m not sure if they might require some fixed amount of extra storage space, in which case you might need N somewhat greater than 5).
Now this scheme requires that you have a functioning civilization that can provide electricity for the computer, that can replace the hardware when it starts failing, and stuff — but that’s all things that we wanted to have anyway. And any essential component of that civilization can run on similarly error-corrected hardware.
And to account for larger-scale problems than cosmic rays (e.g. local earthquake throws harddrive to the ground and shatters it, or you accidentally erase a file when you were supposed to make a copy of it), you’d probably want backup copies of the petabyte in different places across the Earth, which you replace each time something happens to one of them. If there’s a 0.1% chance of that happening on any one day (corresponding to about once per 3 years, which seems like an overestimate if you’re careful), and you immediately notice it and replace the copy within a day, and you have 5 copies in total, the probability that one of them keeps working at all times is ~exp(-(0.001)^5*365*1e9) ≈ 99.96%. So combined with the previous 5, that’d be a multiple of 5*5=25.
This felt enlightening. I’ll add a link to this comment from the doc.
[1] Using a day here rather than an hour or a month isn’t super-motivated. If you reset things very frequently, you might interfere with normal use of the computer, and errors in the resetting-operation might start to dominate the errors from cosmic rays. But I think a day should be above the threshold where that’s much of an issue.
I’m not sure how literally you mean “disprove”, but on its face, “assume nothing is related to anything until you have proven otherwise” is a reasoning procedure that will never recommend any action in the real world, because we never get that kind of certainty. When humans try to achieve results in the real world, heuristics, informal arguments, and looking at what seems to have worked ok in the past are unavoidable.
Global poverty probably has slower diminishing marginal returns, yeah. Unsure about animal welfare. I was mostly thinking about longtermist causes.
Re 80,000 Hours: I don’t know exactly what they’ve argued, but I think “very valuable” is compatible with logarithmic returns. There are also diminishing marginal returns to direct workers in any given cause, so logarithmic returns on money doesn’t mean that money becomes unimportant compared to people, or anything like that.
Because utility and integrity are wholly independent variables, so there is no reason for us to assume a priori that they will always correlate perfectly. So if we wish to believe that integrity and expected value correlated for SBF, then we must show it. We must actually do the math.
This feels a bit unfair when people (i) have argued that utility and integrity will correlate strongly in practical cases (why use “perfectly” as your bar?), and (ii) that they will do so in ways that will be easy to underestimate if you just “do the math”.
You might think they’re mistaken, but some of the arguments do specifically talk about why the “assume 0 correlation and do the math”-approach works poorly, so if you disagree it’d be nice if you addressed that directly.
Because a double-or-nothing coin-flip scales; it doesn’t stop having high EV when we start dealing with big bucks.
Risky bets aren’t themselves objectionable in the way that fraud is, but to just address this point narrowly: realistic estimates put risky bets at much worse EV when you control a large fraction of the altruistic pool of money. I think a decent first approximation is that EA’s impact scales with the logarithm of its wealth. If you’re gambling a small amount of money, that means you should be ~indifferent to a 50⁄50 double-or-nothing (note that even in this case it doesn’t have positive EV). But if you’re gambling with the majority of the wealth that’s predictably committed to EA causes, you should be much more scared of risky bets.
(Also in this case the downside isn’t “nothing” — it’s much worse.)
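A minimal sketch of the log-wealth point, with toy numbers of my own choosing:

```python
import math

def ev_of_flip(wealth: float, stake: float) -> float:
    """Expected change in log-wealth from a 50/50 double-or-nothing bet."""
    return (0.5 * math.log(wealth + stake)
            + 0.5 * math.log(wealth - stake)
            - math.log(wealth))

print(ev_of_flip(1000, 1))    # tiny stake: barely negative
print(ev_of_flip(1000, 900))  # most of the pool: very negative (≈ -0.83)
```

By concavity of the logarithm, the EV is negative at any stake, but it becomes dramatically worse as the stake approaches the whole pool.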
conflicts of interest in grant allocation, work place appointments should be avoided
Worth flagging: Since there are more men than women in EA, I would expect a greater fraction of EA women than EA men to be in relationships with other EAs. (And trying to think of examples off the top of my head supports that theory.) If this is right, the policy “don’t appoint people for jobs where they will have conflicts of interest” would systematically disadvantage women.
(By contrast, considering who you’re already in a work-relationship with when choosing who to date wouldn’t have a systematic effect like that.)
My inclination here would be to (as much as possible) avoid having partners make grant/job-appointment decisions about their partners. But that if someone seems to be the best for a job/grant (from the perspective of people who aren’t their partner), to not deny them that just because it would put them in a position closer to their partner.
(It’s possible that this is in line with what you meant.)
FWIW you can see more information, including some of the reasoning, on page 655 (# written on pdf) / 659 (# according to page searcher) of the report. (H/t Isabel.) See also page 214 for the definition of the question.
Some tidbits:
Experts started out much higher than superforecasters, but updated downwards after discussion. Superforecasters updated a bit upward, but less. (In the figure, those are billions on the y-axis.)
This was surprising to me. I think the experts’ predictions look too low even before updating, and look much worse after updating!
See also the part of the report that talks about “arguments given for lower forecasts”. (The footnotes there contain quotes from people expressing those views.)
(This last bullet point seems irrelevant to me. The question doesn’t specify that the experiments have to be public, and “In the absence of an authoritative source, the question will be resolved by a panel of experts.”)