Research analyst at Open Philanthropy. All opinions are my own.
Lukas Finnveden
There might not be any real disagreement. I’m just saying that there’s no direct conflict between “present people having material wealth beyond what they could possibly spend on themselves” and “virtually all resources are used in the way that totalist axiologies would recommend”.
What’s the argument for why an AI future will create lots of value by total utilitarian lights?
At least for hedonistic total utilitarianism, I expect that a large majority of expected-hedonistic-value (from our current epistemic state) will be created by people who are at least partially sympathetic to hedonistic utilitarianism or other value systems that value a similar type of happiness in a scope-sensitive fashion. And I’d guess that humans are more likely to have such values than AI systems. (At least conditional on my thinking that such values are a good idea, on reflection.)
Objective-list theories of welfare seem even less likely to be endorsed by AIs. (Since they seem pretty specific to human values.)
There are certainly some values you could have that would mainly be concerned that we got any old world with a large civilization. Or that would think it morally appropriate to be happy that someone got to use the universe for what they wanted, and morally inappropriate to be too opinionated about who that should be. But I don’t think that looks like utilitarianism.
I find it plausible that future humans will choose to create far fewer minds than they could. But I don’t think that “selfishly desiring high material welfare” will require this. The Milky Way alone has enough stars for each currently alive human to get an entire solar system. Simultaneously, intergalactic colonization is probably possible (see here), and I think the stars in our own galaxy are less than 1-in-a-billion of all reachable stars. (Most of which are also very far away, which further contributes to their not being very interesting to use for selfish purposes.)
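As a back-of-the-envelope check on the solar-system claim (the star count and population figure below are rough assumptions on my part, not precise data):

```python
# Rough sanity check: even on a low-end estimate of the Milky Way's star count,
# there are many stars per currently alive human.
milky_way_stars_low = 100e9   # assumption: low-end estimate (~100-400 billion stars)
humans_alive = 8e9            # assumption: world population, ~8 billion

stars_per_person = milky_way_stars_low / humans_alive
print(stars_per_person)  # ~12 stars per person, even on the low estimate
```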
When we’re talking about levels of consumption that are greater than a solar system, and that will only take place millions of years in the future, it seems like the relevant kind of human preferences to be looking at is something like “aesthetic” preference. And so I think the relevant analogies are less that of present humans optimizing for their material welfare, but perhaps more something like “people preferring the aesthetics of a clean and untouched universe (or something else: like the aesthetics of a universe used for mostly non-sentient art) over the aesthetics of a universe which is packed with joy”.
I think your point “We may seek to rationalise the former [I personally don’t want to live in a large mediocre world, for self-interested reasons] as the more noble-seeming latter [desire for high average welfare]” is the kind of thing that might influence this aesthetic choice. Where “I personally don’t want to live in a large mediocre world, for self-interested reasons” would split into (i) “it feels bad to create a very unequal world where I have lots more resources than everyone else”, and (ii) “it feels bad to massively reduce the amount of resources that I personally have, to that of the average resident in a universe packed full with life”.
compared to MIRI people, or even someone like Christiano, you, or Joe Carlsmith probably have “low” estimates
Christiano says ~22% (“but you should treat these numbers as having 0.5 significant figures”) without a time-bound; and Carlsmith says “>10%” (see bottom of abstract) by 2070. So no big difference there.
I’ll hopefully soon make a follow-up post with somewhat more concrete projects that I think could be good. That might be helpful.
Are you more concerned that research won’t have any important implications for anyone’s actions, or that the people whose decisions ought to change as a result won’t care about the research?
Similarly, ‘Politics is the Mind-Killer’ might be the rationalist idea that has aged worst—especially for its influence on EA.
What influence are you thinking about? The position argued in the essay seems pretty measured.
Politics is an important domain to which we should individually apply our rationality—but it’s a terrible domain in which to learn rationality, or discuss rationality, unless all the discussants are already rational. [...]
I’m not saying that I think we should be apolitical, or even that we should adopt Wikipedia’s ideal of the Neutral Point of View. But try to resist getting in those good, solid digs if you can possibly avoid it. If your topic legitimately relates to attempts to ban evolution in school curricula, then go ahead and talk about it—but don’t blame it explicitly on the whole Republican Party; some of your readers may be Republicans, and they may feel that the problem is a few rogues, not the entire party.
I think the strongest argument against EV-maximization in these cases is the Two-Envelopes Problem for Uncertainty about Brain-Size Valuation.
I liked this recent interview with Mark Dybul who worked on PEPFAR from the start: https://www.statecraft.pub/p/saving-twenty-million-lives
One interesting contrast with the conclusion in this post is that Dybul thinks that PEPFAR’s success was a direct consequence of how it didn’t involve too many people and departments early on — because the negotiations would have been too drawn out and too many parties would have tried to get pieces of control. So maybe a transparent process that embraced complexity wouldn’t have achieved much, in practice.
(At other parts of the process he leaned further towards transparency than was standard — sharing a ton of information with Congress.)
FWIW you can see more information, including some of the reasoning, on page 655 (# written on pdf) / 659 (# according to page searcher) of the report. (H/t Isabel.) See also page 214 for the definition of the question.
Some tidbits:
Experts started out much higher than superforecasters, but updated downwards after discussion. Superforecasters updated a bit upward, but less:
(Those are billions on the y-axis.)
This was surprising to me. I think the experts’ predictions look too low even before updating, and look much worse after updating!
Here’s the part of the report that covers “arguments given for lower forecasts”. (The footnotes contain quotes from people expressing those views.)
Arguments given for lower forecasts (2024: <$40m, 2030: <$110m, 2050: ⩽$200m)
● Training costs have been stable around $10m for the last few years.1326
● Current trend increases are not sustainable for many more years.1327 One team cited this AI Impacts blog post.
● Major companies are cutting costs.1328
● Increases in model size and complexity will be offset by a combination of falling compute costs, pre-training, and algorithmic improvements.1329
● Large language models will probably see most attention in the near future, and these are bottlenecked by availability of data, which will lead to smaller models and less compute.1330
● Not all experiments will be public, and it is possible that the most expensive experiments will not be public.1331
(This last bullet point seems irrelevant to me. The question doesn’t specify that the experiments have to be public, and “In the absence of an authoritative source, the question will be resolved by a panel of experts.”)
It’s the crux between you and Ajeya, because you’re relatively more in agreement on the other numbers. But I think that adopting the XPT numbers on these other variables would notably slow down your own timelines, because of the almost complete lack of increase in spending.
That said, if the forecasters agreed with your compute requirements, they would probably also forecast higher spending.
in terms of saving “disability-adjusted life years” or DALYs, “a case of HIV/AIDS can be prevented for $11, and a DALY gained for $1” by improving the safety of blood transfusions and distributing condoms
These numbers are wild compared to e.g. current GiveWell numbers. My guess would be that they’re wrong, and if so, that this was a big part of why PEPFAR did comparatively better than expected. Or maybe they were significantly less scalable (measured in cost of marginal life saved as a function of lives saved so far) than PEPFAR.
If the numbers were right, and you could save more lives than PEPFAR for 100x less money (or 30x (?) less after taking into account some falls in cost), I’m not sure I buy that the political feasibility of PEPFAR was greater than the much cheaper ask (a priori). At least I get very sympathetic to the then-economists.
(But again, I’d guess those numbers were probably wrong or unscalable?)
Nice, gotcha.
Incidentally, as its central estimate for algorithmic improvement, the takeoff speeds model uses AI and Efficiency’s ~1.7x per year, and then halves it to ~1.3x per year (because today’s algorithmic progress might not generalize to TAI). If you’re at 2x per year, then you should maybe increase the “returns to software” from 1.25 to ~3.5, which would cut the model’s timelines by something like 3 years. (More on longer timelines, less on shorter timelines.)
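(For concreteness: “halving” a multiplicative growth rate here means halving it in log space, i.e. taking the square root of the annual factor. A quick sketch of that arithmetic, assuming that interpretation of the model’s adjustment:)

```python
import math

# Halving a multiplicative growth rate in log space = sqrt of the annual factor.
rate = 1.7                                # ~1.7x/year (AI and Efficiency's estimate)
halved = math.exp(0.5 * math.log(rate))   # equivalently math.sqrt(rate)
print(round(halved, 2))  # ~1.3x/year, matching the model's adjusted figure
```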
Yeah sorry, I didn’t mean to say this directly contradicted anything you said. It just felt like a good reference that might be helpful to you or other people reading the thread. (In retrospect, I should have said that and/or linked it in response to the mention in your top-level comment instead.)
(Also, personally, I do care about how much effort and selection is required to find good retrodictions like this, so in my book “I didn’t look up the data on Google beforehand” is relevant info. But it would have been way more impressive if someone had been able to pull that off in 1890, and I agree this shouldn’t be confused for that.)
Re “it was incorrect by an order of magnitude”: that seems fine to me. If we could get that sort of precision for predicting TAI, that would be awesome and outperform any other prediction method I know about.
and notably there’s been perhaps a 2x speedup in algorithmic progress since 2022
I don’t understand this. Why would there be a 2x speedup in algorithmic progress?
And, as I think Eliezer said (roughly), there don’t seem to be many cases where new tech was predicted based on when some low-level metric would exceed the analogous metric in a biological system. [...] And the way in which machines perform tasks usually looks very different than how biological systems do it (bird vs. airplanes, etc.).
From Birds, Brains, Planes, and AI:
This data shows that Shorty [hypothetical character introduced earlier in the post] was entirely correct about forecasting heavier-than-air flight. (For details about the data, see appendix.) Whether Shorty will also be correct about forecasting TAI remains to be seen.
In some sense, Shorty has already made two successful predictions: I started writing this argument before having any of this data; I just had an intuition that power-to-weight is the key variable for flight and that therefore we probably got flying machines shortly after having comparable power-to-weight as bird muscle. Halfway through the first draft, I googled and confirmed that yes, the Wright Flyer’s motor was close to bird muscle in power-to-weight. Then, while writing the second draft, I hired an RA, Amogh Nanjajjar, to collect more data and build this graph. As expected, there was a trend of power-to-weight improving over time, with flight happening right around the time bird-muscle parity was reached.
I think my biggest disagreement with the takeoff speeds model is just that it’s conditional on things like: no coordinated delays, regulation, or exogenous events like war, and doesn’t take into account model uncertainty.
Cool, I thought that was most of the explanation for the difference in the median. But I thought it shouldn’t be enough to explain the 14x difference between 28% and 2% by 2030, because I think there should be a ≥20% chance that there are no significant coordinated delays, regulation, or relevant exogenous events if AI goes wild in the next 7 years. (And that model uncertainty should work to increase rather than decrease the probability, here.)
If you think robotics would definitely be necessary, then I can see how that would be significant.
But I think it’s possible that we get a software-only singularity. Or more broadly, simultaneously having (i) AI improving algorithms (...improving AIs), (ii) a large fraction of the world’s fab-capacity redirected to AI chips, and (iii) AIs helping with late-stage hardware stuff like chip-design. (I agree that it takes a long time to build new fabs.) This would simultaneously explain why robotics aren’t necessary (before we have crazy good AI) and decrease the probability of regulatory delays, since the AIs would just need to be deployed inside a few companies. (I can see how regulation would by-default slow down some kinds of broad deployment, but it seems super unclear whether there will be regulation put in place to slow down R&D and internal deployment.)
My own distribution over the training FLOP for transformative AI is centered around ~10^32 FLOP using 2023 algorithms, with a standard deviation of about 3 OOM.
Thanks for the numbers!
For comparison, takeoffspeeds.com has an aggressive Monte Carlo run (with a median of 10^31 training FLOP) that yields a median of 2033.7 for 100% automation — and a p(TAI < 2030) of ~28%. That 28% is pretty radically different from your 2%. Do you know your biggest disagreements with that model?
The 1 OOM difference in training FLOP presumably doesn’t explain that much. (Although maybe it’s more, because takeoffspeeds.com talks about “AGI” and you talk about “TAI”. On the other hand, maybe your bar for “transformative” is lower than 100% automation.)
Some related responses to stuff in your post:
The most likely cause of such a sudden acceleration seems to be that pre-superintelligent systems could accelerate technological progress. But, as I have just argued above, a rapid general acceleration of technological progress from pre-superintelligent AI seems very unlikely in the next few years.
You argued that AI labor would be small in comparison to all of human labor, if we got really good software in the next 4 years. But if we had recently gotten such insane gains in ML-capabilities, people would want to vastly increase investment in ML-research (and hardware production) relative to everything else in the world. Normally, labor spent on ML research would lag behind, because it takes a long time to teach a large number of humans the requisite skills. But for each skill, you’d only need to figure out how to teach AI about it once, and then all 10 million AIs would be able to do it. (There would certainly be some lag here, too. Your post says “lag for AI will likely be more than a year”, which I’m sympathetic to, but there’s time for that.)
When I google “total number of ml researchers”, the largest number I see is 300k and I think the real answer is <100k. So I don’t think a huge acceleration in AI-relevant technological progress before 2030 is out of the question.
(I think it’s plausible we should actually be thinking about the best ML researchers rather than just counting up the total number. But I don’t think it’d be crazy for AIs to meet that bar in the hypothetical you paint. Given the parallelizability of AI, it’s both the case that (i) it’s worth spending much more effort on teaching skills to AIs, and (ii) that it’s possible for AIs to spend much more effective time on learning.)
I am also inclined to cut some probability away from short timelines given the lack of impressive progress in general-purpose robotics so far, which seems like an important consideration given that the majority of labor in the world currently requires a physical component.
Mostly not ML research.
Also, if the AIs are bottlenecked by motor skills, humans can do that part. When automating small parts of the total economy (like ML research or hardware production), there’s room to get more humans into those industries to do all the necessary physical tasks. (And at the point when AI cognitive output is large compared to the entire human workforce, you can get a big boost in total world output by having humans switch into just doing manual labor, directed by AIs.)
However, my unconditional view is somewhat different. After considering all potential delays (including regulation, which I think is likely to be substantial) and model uncertainty, my overall median TAI timeline is somewhere between 20-30 years from now, with a long tail extending many decades into the future.
I can see how stuff like regulation would feature in many worlds, but it seems high variance and like it should allow for a significant probability of ~no delay.
Also, my intuition is that 2% is small enough in the relevant context that model uncertainty should push it up rather than down.
The quote continues:
Of the remaining 5 %, around 70 % would eventually be reached by other civilisations, while 30 % would have remained empty in our absence.
I think the 70%/30% numbers are the relevant ones for comparing human colonization vs. extinction vs. misaligned AGI colonization. (Since 5% cuts the importance of everything equally.)
...assuming defensive dominance in space, where you get to keep space that you acquire first. I don’t know what happens without that.
This would suggest that if we’re indifferent between space being totally uncolonized and being colonized by a certain misaligned AGI, and if we’re indifferent between aliens and humans colonizing space, then preventing that AGI is ~3x as good as preventing extinction.
If we value aliens less than humans, it’s less. If we value the AGI positively, it’s also less. If we value the AGI negatively, it’d be more.
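The ~3x figure falls out of simple arithmetic under the stated indifference assumptions (this is my reconstruction of the calculation, not a quote from the report):

```python
# If humans go extinct, aliens eventually reach 70% of our share and 30% stays empty;
# under indifference between humans and aliens, only the never-colonized 30% counts as lost.
# A misaligned AGI instead takes 100% of it (valued at zero under the assumptions).
lost_to_extinction = 0.30
lost_to_misaligned_agi = 1.0

ratio = lost_to_misaligned_agi / lost_to_extinction
print(round(ratio, 1))  # ~3.3, i.e. the "~3x as good" figure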
If AGI systems had goals that were cleanly separated from the rest of their cognition, such that they could learn and self-improve without risking any value drift (as long as the values-file wasn’t modified), then there’s a straightforward argument that you could stabilise and preserve that system’s goals by just storing the values-file with enough redundancy and digital error correction.
So this would make section 6 mostly irrelevant. But I think most other sections remain relevant, insofar as people weren’t already convinced that being able to build stable AGI systems would enable world-wide lock-in.
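To illustrate the “redundancy and digital error correction” point: even a crude repetition code (store several copies, majority-vote each byte on read) corrects corruption in a minority of copies. A minimal sketch of that idea (my illustration, not anything from the post):

```python
# Repetition code: store N copies of the "values file"; majority-vote each byte on read.

def store(values: bytes, n_copies: int = 5) -> list[bytearray]:
    return [bytearray(values) for _ in range(n_copies)]

def read(copies: list[bytearray]) -> bytes:
    # Per-byte majority vote corrects corruption in a minority of copies.
    out = bytearray()
    for i in range(len(copies[0])):
        votes = [c[i] for c in copies]
        out.append(max(set(votes), key=votes.count))
    return bytes(out)

copies = store(b"example values file")
copies[0][0] ^= 0xFF  # corrupt one copy
copies[3][5] ^= 0x0F  # and another copy, at a different position
assert read(copies) == b"example values file"
```

(Real systems would use something more efficient, like Reed-Solomon codes, plus periodic re-copying; the repetition code is just the simplest instance of the idea.)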
Therefore, it seems to me that most of your doc assumes we’re in this scenario [without clean separation between values and other parts]?
I was mostly imagining this scenario as I was writing, so when relevant, examples/terminology/arguments will be tailored to that, yeah.
Here’s one line of argument:
Positive argument in favor of humans: It seems pretty likely that whatever I’d value on-reflection will be represented in a human future, since I’m a human. (And accordingly, I’m similar to many other humans along many dimensions.)
If AI values were sampled ~randomly (whatever that means), I think that the above argument would be basically enough to carry the day in favor of humans.
But here’s a salient positive argument in favor of why AIs’ values will be similar to mine: People will be training AIs to be nice and helpful, which will surely push them towards better values.
However, I also expect people to be training AIs for obedience and, in particular, training them not to disempower humanity. So if we condition on a future where AIs disempower humanity, we evidently didn’t have that much control over their values. This significantly weakens the strength of the argument “they’ll be nice because we’ll train them to be nice”.
In addition: human disempowerment is more likely to succeed if AIs are willing to egregiously violate norms, such as by lying, stealing, and killing. So conditioning on human disempowerment also updates me somewhat towards egregiously norm-violating AI. That makes me feel less good about their values.
Another argument is that, in the near term, we’ll train AIs to act nicely on short-horizon tasks, but we won’t particularly train them to deliberate and reflect on their values well. So even if “AIs’ best-guess stated values” are similar to “my best-guess stated values”, there’s less reason to believe that “AIs’ on-reflection values” are similar to “my on-reflection values”. (Whereas the basic argument about my being similar to humans still works OK: “my on-reflection values” vs. “other humans’ on-reflection values”.)
Edit: Oops, I accidentally switched to talking about “my on-reflection values” rather than “total utilitarian values”. The former is ultimately what I care more about, though, so it is what I’m more interested in. But sorry for the switch.