“Second, the primary benefits—higher incomes and earlier biomedical breakthroughs—are also broadly shared; they are not gated to the single lab that crosses the finish line first.”
If you look at the leaders of major AI companies you see people like Elon Musk and others who are concerned with getting to AGI before others whom they distrust and fear. They fear immense power in the hands of rivals, whether rivals with conflicting ideologies or rivals in general.
OpenAI was founded and funded in significant part based on Elon Musk’s fear of the consequences of the Google leadership having power over AGI (in particular in light of statements suggesting that producing AI which leads to human extinction would be OK). States fear how immense AGI power will be used against them. For these competitive dynamics, power (including the power to coerce or harm others) and relative standing matter more than access to advanced medicine or broad prosperity.
In the shorter term, an AI company whose models are months behind may find that its APIs have negative margins while competitors earn 50% margins. Avoiding falling behind is increasingly a matter of institutional survival for AI companies, and a powerful motive to increase global risk a small amount in order to profit rather than go bankrupt.
The motive I see to take incremental risk is “if AI wipes out humanity I’m just as dead either way, and my competitors are similarly or more dangerous than me (self-serving bias plays into this) but there are huge ideological or relative position (including corporate survival) gains from control over powerful AGI that are only realized by being fast, so I should take a bit more risk of disaster conditional on winning to raise the chance of winning.” This dynamic looks to apply to real AI company leaders who claim big risks of extinction while rushing forward.
With multiple players doing that, the baseline level of risk from another lab goes up, and the strategic appeal of incrementing it one more time for relative advantage continues. You can get up to very high levels of risk perceived by labs that way, accepting each small increment of risk as minor compared to the risk posed by other labs and the appeal of getting a lead, setting a new worse baseline for others to compete against.
And epistemic variation makes it all worse: the least concerned players spontaneously set a higher baseline of risk for everyone else to compete against.
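To make the ratchet concrete, here is a toy sketch with purely illustrative numbers (nothing here is an estimate of actual lab behavior): each lab accepts another small increment of risk whenever that increment looks minor next to the baseline it already perceives from rivals, and its acceptance becomes the new baseline the next lab reasons against.

```python
# Toy model of the risk ratchet described above; all numbers are illustrative.
def ratchet(initial_baseline=0.05, increment=0.01, tolerance=0.25, cap=1.0):
    """Track perceived risk as labs keep accepting increments that look 'minor'."""
    baseline = initial_baseline
    steps = 0
    # Keep ratcheting while one more increment is below `tolerance` as a
    # fraction of the risk already attributed to other labs, and the total
    # stays under the cap.
    while increment / baseline < tolerance and baseline + increment < cap:
        baseline += increment  # one lab accepts a bit more risk for an edge...
        steps += 1             # ...and resets the baseline everyone else faces
    return steps, baseline

steps, final_risk = ratchet()
print(f"increments accepted: {steps}, final perceived risk: {final_risk:.2f}")
```

Once the baseline is nontrivial, each further increment always passes the "minor compared to what rivals impose" test, so the process only stops near the cap.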
Right, those comments were about the big pause letter, which while nominally global in fact only applied at the time to the leading US lab, and even if voluntarily complied with would not affect the PRC’s efforts to catch up in semiconductor technology, nor Chinese labs catching up algorithmically (as they have partially done).
Sure, these are possible. My view above was about expectations. #1 and #2 are possible, although look less likely to me. There’s some truth to #3, but the net effect is still gap closing, and the slowing tends to be more earlier (when it is less impactful) than later.
On my view the OP’s text citing me left out the most important argument from the section they linked: the closer and tighter an AI race is at the international level as the world reaches strong forms of AGI and ASI, the less slack there is for things like alignment. The US and Chinese governments have the power to prohibit their own AI companies from negligently (or willfully) racing to create AI that overthrows them, if they believed that was a serious risk and wanted to prioritize stopping it. That willingness will depend on scientific and political efforts, but even if those succeed enormously, the international cooperation between the US and China will pose additional challenges. The level of conviction in risks governments would need would be much higher than to rein in their own companies without outside competition, and there would be more political challenges.
Absent an agreement with enough backing it to stick, slowdown by the US tightens the international gap in AI and means less slack (and less ability to pause when it counts) and more risk of catastrophe in the transition to AGI and ASI. That’s a serious catastrophe-increasing effect of unilateral early (and ineffectual at reducing risk) pauses. You can support governments having the power to constrain AI companies from negligently destroying them, and international agreements between governments to use those powers in a coordinated fashion (taking steps to assure each other in doing so), while not supporting unilateral pause to make the AI race even tighter.
I think there are some important analogies with nuclear weapons. I am a big fan of international agreements to reduce nuclear arsenals, but I oppose the idea of NATO immediately destroying all its nuclear weapons and then suffering nuclear extortion from Russia and China (which would also still leave the risk of nuclear war between the remaining nuclear states). Unilateral reductions as a gesture of good faith that still leave a deterrent can be great, but that’s much less costly than evening up the AI race (minimal arsenals for deterrence are not that large).
“So, at least when you go to the bargaining table, if not here, we need to ask for fully what we want without pre-surrendering. “Pause AI!”, not “I know it’s not realistic to pause, but maybe you could tap the brakes?” What’s realistic is to some extent what the public says is realistic.”
I would think your full ask should be the international agreement between states, and companies regulated by states in accord with that, not unilateral pause by the US (currently leading by a meaningful margin) until AI competition is neck-and-neck.
And people should consider both the possibilities of ultimate success and of failure with your advocacy, and be wary of intermediate goals that make things much worse if you ultimately fail with global arrangements but make them only slightly more likely to succeed. I think it is certainly possible some kind of inclusive (e.g. including all the P-5) international deal winds up governing and delaying the AGI/ASI transition, but it is also extremely plausible that it doesn’t, and I wouldn’t write off consequences in the latter case.
I have two views in the vicinity. First, there’s a general issue that human moral practice generally isn’t just axiology, but also includes a number of elements that are built around interacting with other people with different axiologies, e.g. different ideologies coexisting in a liberal society, different partially selfish people or family groups coexisting fairly while preferring different outcomes. Most flavors of utilitarianism ignore those elements, and ceteris paribus would, given untrammeled power, call for outcomes that would be ruinous for ~all currently existing beings, and in particular existing societies. That could be classical hedonistic utilitarianism diverting the means of subsistence from all living things as we know them to fuel more hedonium, negative-leaning views wanting to be rid of all living things with any prospects for having or causing pain or dissatisfaction, or playing double-or-nothing with the universe until it is destroyed with probability 1.
So most people have reason to oppose any form of utilitarianism getting absolute power (and many utilitarianisms would have reason to self-efface into something less scary, less dangerous, and less prone to using power in such ways, since a view like that would have a better chance of realizing more of what it values by endangering other concerns less). I touch on this in an article with Elliott Thornley.
I have an additional objection to hedonic-only views in particular, in that they don’t even take as inputs many of people’s concerns, and so more easily wind up hostile to particular individuals supposedly for those individuals’ sake. E.g. I would prefer to retain my memories and personal identity, knowledge and autonomy, rather than be coerced into forced administration of pleasure drugs. I also would like to achieve various things in the world in reality, and would prefer that to an experience machine. A normative scheme that doesn’t even take those concerns as inputs is fairly definitely going to run roughshod over them, even if some theories that take them as inputs might do so too.
Physicalists and illusionists mostly don’t agree with the identification of ‘consciousness’ with magical stuff or properties bolted onto the psychological or cognitive science picture of minds. All the real feelings and psychology that drive our thinking, speech and action exist. I care about people’s welfare, including experiences they like, but also other concerns they have (the welfare of their children, being remembered after they die), and that doesn’t hinge on magical consciousness that we, the physical organisms having this conversation, would have no access to. The illusion is of the magical part.
Re desires, the main upshot of non-dualist views of consciousness, I think, is responding to arguments that invoke special properties of conscious states to say they matter but other concerns of people do not. It’s still possible to be a physicalist and think that only selfish preferences focused on your own sense impressions or introspection matter; it just looks more arbitrary. I think this is important because it’s plausible that many AI minds will have concerns mainly focused on the external world rather than their own internal states, and running roughshod over those values because they aren’t narrowly mentally-self-focused seems bad to me.
Here’s a fairly safe prediction: most of the potential harm from AI is potential harm to nonhuman animals.
I would think for someone who attended an AI, Animals, and Digital Minds conference it should look like an extremely precarious prediction, as AIs will likely immensely outnumber nonhuman animals, and could have much more of most features we could use in measuring ‘harm’?
Rapid fire:
Nearterm extinction risk from AI is wildly closer to total AI x-risk than the nuclear analog
My guess is that nuclear war interventions powerful enough to be world-beating for future generations would look tremendous in averting current human deaths, and most of the WTP should come from that if one has a lot of WTP related to each of those worldviews
Re suspicious convergence, what do you want to argue with here? In the past I’ve favored allocating less than 1% of my marginal AI allocation to VOI and low-hanging fruit on nuclear risk not leveraging AI-related things (because of larger, more likely near-term risks from AI with more tractability and neglectedness); recent AI developments tend to push that down, but might surface something in the future that is really leveraged on avoiding nuclear war
I agree not much has been published in journals on the impact of AI being developed in dictatorships
Re lock-in, I do not think it’s remote for a CCP-led AGI future (my views are different from what that paper limited itself to).
I agree that people should not focus on nuclear risk as a direct extinction risk (and have long argued this), see Toby’s nuke extinction estimates as too high, and would assess measures to reduce damage from nuclear winter to developing neutral countries mainly in GiveWell-style or ordinary CBA terms, while considerations about future generations would favor focus on AI, and to a lesser extent bio.
However, I do think this wrongly downplays the effects on our civilization beyond casualties and local damage of a nuclear war that wrecks the current nuclear powers, e.g. on disrupting international cooperation, rerolling contingent nice aspects of modern liberal democracy, or leading to release of additional WMD arsenals (such as bioweapons, while disrupting defense against those weapons). So the ‘can nuclear war with current arsenals cause extinction’ question misses most of the existential risk from nuclear weapons, which is indirect in contributing to other risks that could cause extinction or lock-in of permanent awful regimes. I think marginal philanthropic dollars can save more current lives and help the overall trajectory of civilization more on other risks, but I think your direct extinction numbers above do greatly underestimate how much worse the future should be expected to be given a nuclear war that laid waste to, e.g. NATO+allies and the Russian Federation.
You dismiss that here:
> Then discussions move to more poorly understood aspects of the risk (e.g. how the distribution of values after a nuclear war affects the longterm values of transformative AI).
But I don’t think it’s a huge stretch to say that a war with Russia largely destroying the NATO economies (and their semiconductor supply chains), leaving the PRC to dominate the world system and the onrushing creation of powerful AGI, makes a big difference to the chance of locked-in permanent totalitarianism and the values of one dictator running roughshod over the low-hanging fruit of many others’ values. That’s very large compared to these extinction effects. It also doesn’t require bets on extreme and plausibly exaggerated nuclear winter magnitude.
Similarly, the chance of a huge hidden state bioweapons program having its full arsenal released simultaneously (including doomsday pandemic weapons) skyrockets in an all-out WMD war in obvious ways.
So if one were to find super-leveraged ways to reduce the chance of nuclear war (this applies less to measures to reduce damage to nonbelligerent states), then in addition to beating GiveWell at saving current lives, they could have big impacts on future generations. Such opportunities are extremely scarce, but the bar for looking good in future-generation impacts is lower than I think this post suggests.
Thank you for the comment Bob.
I agree that I also am disagreeing on the object-level, as Michael made clear with his comments (I do not think I am talking about a tiny chance, although I do not think the RP discussions characterized my views as I would), and some other methodological issues besides two-envelopes (related to the object-level ones). E.g. I would not want to treat a highly networked AI mind (with billions of bodies and computation directing them in a unified way, on the scale of humanity) as a millionth or a billionth of the welfare of the same set of robots and computations with less integration (and overlap of shared features, or top-level control), ceteris paribus.
Indeed, I would be wary of treating the integrated mind as though welfare stakes for it were half or a tenth as great, seeing that as a potential source of moral catastrophe, like ignoring the welfare of minds not based on proteins. E.g. having tasks involving suffering and frustration done by large integrated minds, and pleasant ones done by tiny minds, while increasing the amount of mental activity in the former. It sounds like the combination of object-level and methodological takes attached to these reports would favor ignoring the integrated mind almost completely.

Incidentally, in a world where small animals are being treated extremely badly and are numerous, I can see a temptation to err in their favor, since even overestimates of their importance could be shifting things in the right marginal policy direction. But thinking about the potential moral catastrophes on the other side helps sharpen the motivation to get it right.
In practice, I don’t prioritize moral weights issues in my work, because I think the most important decisions hinging on them will be made in an era with AI-aided mature sciences of mind, philosophy and epistemology. And as I have written, regardless of your views about small minds and large minds, it won’t be the case that e.g. humans are utility monsters of impartial hedonism (rather than something bigger, smaller, or otherwise different), and grounds for focusing on helping humans won’t be terminally impartial-hedonistic in nature. But from my viewpoint, baking in the assumption that integration (and unified top-level control or mental overlap of some parts of computation) close to eliminates mentality or welfare (vs less integrated collections of computations) seems bad in a non-Pascalian fashion.
Lots of progress on AI, alignment, and governance. This sets up a position where it is likely that a few years later there’s an AI capabilities explosion and among other things:
Mean human wealth skyrockets, while AI+robots make cultured meat and substitutes, as well as high welfare systems (and reengineering biology) cheap relative to consumers’ wealth; human use of superintelligent AI advisors leads to global bans on farming with miserable animals and/or all farming
Perfect neuroscientific and psychological knowledge of humans and animals, combined with superintelligent advisors, lead to concern for wild animals; robots with biological like abilities and greater numbers and capacities can safely adjust wild animal ecologies to ensure high welfare at negligible material cost to humanity, and this is done
If it was 2028 it would be more like ‘the above has already happened’ rather than conditions being well set up for it.
Not much new on that front besides continuing to back the donor lottery in recent years, for the same sorts of reasons as in the link, and focusing on research and advising rather than sourcing grants.
A bit, but more on the willingness of AI experts and some companies to sign the CAIS letter and lend their voices to the view ‘we should go forward very fast with AI, but keep an eye out for better evidence of danger and have the ability to control things later.’
My model has always been that the public is technophobic, but that ‘this will be constrained like peaceful nuclear power or GMO crops’ isn’t enough to prevent a technology that enables DSA and OOMs (and nuclear power and GMO crops do exist; if AGI exists somewhere, that place outgrows the rest of the world if the rest of the world sits on the sidelines). If leaders’ understanding of the situation is that public fears are erroneous, and going forward with AI means a hugely better economy (and thus popularity for incumbents) and avoiding a situation where abhorred international rivals can safely disarm their military, then I don’t expect it to be stopped. So the expert views, as defined by who the governments view as experts, are central in my picture.
Visible AI progress like ChatGPT strengthens ‘fear AI disaster’ arguments but at the same time strengthens ‘fear being behind in AI/others having AI’ arguments. The kinds of actions that have been taken so far are mostly of the latter type (export controls, etc), and measures to monitor the situation and perhaps do something later if the evidential situation changes. I.e. they reflect the spirit of the CAIS letter, which companies like OpenAI and such were willing to sign, and not the pause letter which many CAIS letter signatories oppose.
The evals and monitoring agenda is an example of going for value of information rather than banning/stopping AI advances, like I discussed in the comment, and that’s a reason it has had an easier time advancing.
I don’t want to convey that there was no discussion, thus my linking the discussion and saying I found it inadequate and largely missing the point from my perspective. I made an edit for clarity, but would accept suggestions for another.
I have never calculated moral weights for Open Philanthropy, and as far as I know no one has claimed that. The comment you are presumably responding to began by saying I couldn’t speak for Open Philanthropy on that topic, and I wasn’t.
Thanks, I was referring to this as well, but should have had a second link for it as the Rethink page on neuron counts didn’t link to the other post. I think that page is a better link than the RP page I linked, so I’ll add it in my comment.
I’m not planning on continuing a long thread here, I mostly wanted to help address the questions about my previous comment, so I’ll be moving on after this. But I will say two things regarding the above. First, this effect (computational scale) is smaller for chickens but progressively enormous for e.g. shrimp or lobster or flies. Second, this is a huge move and one really needs to wrestle with intertheoretic comparisons to justify it:
> I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above.
Suppose we compared the mass of the human population of Earth with the mass of an individual human. We could compare them on 12 metrics, like per capita mass, per capita square root mass, per capita foot mass… and aggregate mass. If we use the equal-weighted geometric mean, we will conclude the individual has a mass within an order of magnitude of the total Earth population, instead of billions of times less.
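Spelling out the arithmetic in that analogy (a rough sketch with assumed numbers: ~8 billion people, and 11 of the 12 metrics being per capita measures on which the individual and the whole population score identically):

```python
# Geometric vs. actual ratio in the mass analogy above; numbers are illustrative.
import math

population = 8_000_000_000
# individual / population on each of 12 metrics: 11 per-capita-style metrics
# where the ratio is ~1, plus aggregate mass where it is ~1 / population.
ratios = [1.0] * 11 + [1.0 / population]

geo_mean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

print(f"equal-weighted geometric mean: {geo_mean:.3f}")         # ~0.15, within an order of magnitude
print(f"actual aggregate-mass ratio:   {1.0 / population:.1e}")  # ~1.2e-10, billions of times less
```

The eleven ratios of ~1 almost completely wash out the one metric on which the quantities actually differ by ten orders of magnitude.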
I can’t speak for Open Philanthropy, but I can explain why I personally was unmoved by the Rethink report (and think its estimates hugely overstate the case for focusing on tiny animals, although I think the corrected version of that case still has a lot to be said for it).
Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.
> However, I say “naively” because this doesn’t actually work, due to two-envelope effects... whenever you’re tempted to multiply such numbers by something, remember two-envelope effects!)
[Edited for clarity] I was not satisfied with Rethink’s attempt to address that central issue, that you get wildly different results from assuming the moral value of a fruit fly is fixed and reporting possible ratios to elephant welfare as opposed to doing it the other way around.
It is not unthinkably improbable that an elephant brain, where reinforcement from a positive or negative stimulus adjusts millions of times as many neural computations, could be seen as vastly more morally important than a fruit fly, just as one might think that a fruit fly is much more important than a thermostat (which some suggest is conscious and possesses preferences). Since on some major functional aspects of mind there are differences of millions of times, that suggests an expected value orders of magnitude higher for the elephant if you put a bit of weight on the possibility that moral weight scales with the extent of, e.g., the computations that are adjusted by positive and negative stimuli. A 1% weight on that plausible hypothesis means the expected value of the elephant is immense vs the fruit fly. So there will be something that might get lumped in with ‘overwhelming hierarchicalism’ in the language of the top-level post. Rethink’s various discussions of this issue in my view missed the mark.
Go the other way and fix the value of the elephant at 1, and the possibility that value scales with those computations is treated as a case where the fly is worth ~0. Then a 1% or even 99% credence in value scaling with computation has little effect: the elephant-to-fruit-fly ratio is capped far below the computational ratio (at roughly one over the credence assigned to parity), so tiny-mind dominance in aggregate is almost automatic. The same argument can then be used to make a like case for total dominance of thermostat-like programs, or individual neurons, over insects. And then again for individual electrons.
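To make the two framings concrete, here is a minimal numeric sketch with assumed illustrative numbers (a 10^6 elephant:fly ratio in reward-adjusted computation and a 1% credence that moral weight scales with it), not anyone’s actual estimates:

```python
# Two framings of the same uncertainty give wildly different expected ratios.
p_scaling = 0.01    # credence that moral weight scales with neural computation
scale_ratio = 1e6   # assumed elephant:fly ratio of reward-adjusted computation

# Framing 1: fix the fruit fly's value at 1 and take expectations over the elephant.
e_elephant = (1 - p_scaling) * 1 + p_scaling * scale_ratio
print(f"fly fixed at 1      -> implied elephant:fly ratio ~ {e_elephant:,.0f}")  # ~10,001

# Framing 2: fix the elephant's value at 1 and take expectations over the fly.
e_fly = (1 - p_scaling) * 1 + p_scaling * (1 / scale_ratio)
print(f"elephant fixed at 1 -> implied elephant:fly ratio ~ {1 / e_fly:.2f}")    # ~1.01
```

Same hypotheses and credences, but the implied ratio moves from roughly 10,000 to roughly 1 depending only on which animal’s value is held fixed; that is the two-envelope problem in miniature.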
As I see it, Rethink basically went with the ‘ratios to fixed human value’, so from my perspective their bottom-line conclusions were predetermined and uninformative. But the alternatives they ignore lead me to think that the expected value of welfare for big minds is a lot larger than for small minds (and I think that can continue, e.g. giant AI minds with vastly more reinforcement-affected computations and thoughts could possess much more expected welfare than humans, as many humans might have more welfare than one human).
I agree with Brian Tomasik’s comment from your link:

> the moral-uncertainty version of the [two envelopes] problem is fatal unless you make further assumptions about how to resolve it, such as by fixing some arbitrary intertheoretic-comparison weights (which seems to be what you’re suggesting) or using the parliamentary model.
By the same token, arguments about the number of possible connections/counterfactual richness in a mind could suggest superlinear growth in moral importance with computational scale. Similar issues would arise for theories involving moral agency or capacity for cooperation/game theory (on which humans might stand out by orders of magnitude relative to elephants; marginal cases being socially derivative), but those were ruled out of bounds for the report. Likewise it chose not to address intertheoretic comparisons and how those could very sharply affect the conclusions. Those are the kinds of issues with the potential to drive massive weight differences.
I think some readers benefitted a lot from reading the report because they did not know that, e.g. insects are capable of reward learning and similar psychological capacities. And I would guess that will change some people’s prioritization between different animals, and of animal vs human focused work. I think that is valuable. But that information was not new to me, and indeed I had argued for many years that insects met a lot of the functional standards one could use to identify the presence of well-being, and that even after taking two-envelopes issues and nervous system scale into account expected welfare at stake for small wild animals looked much larger than for FAW.
I happen to be a fan of animal welfare work relative to GHW’s other grants at the margin because animal welfare work is so highly neglected (e.g. Open Philanthropy is a huge share of world funding on the most effective FAW work but quite small compared to global aid) relative to the case for views on which it’s great. But for me Rethink’s work didn’t address the most important questions, and largely baked in its conclusions methodologically.
One can value research and find it informative or worth doing without being convinced of every view of a given researcher or team. Open Philanthropy also sponsored a contest to surface novel considerations that could affect its views on AI timelines and risk. The winners mostly present conclusions or considerations on which AI would be a lower priority, but that doesn’t imply that the judges or the institution changed their views very much in that direction.
At large scale, information can be valuable enough to buy even if it only modestly adjusts proportional allocations of effort; the minimum bar for funding a research project with hundreds of thousands or millions of dollars presumably isn’t that one pivots billions of dollars on the results with near-certainty.
I think there’s some talking past each other happening.
I am claiming that there are real coordination problems that lead even actors who believe in a large amount of AI risk to think that they need to undertake risky AI development (or riskier) for private gain or dislike of what others would do. I think that dynamic will likely result in future governments (and companies absent government response) taking on more risk than they otherwise would, even if they think it’s quite a lot of risk.
I don’t think that most AI companies or governments would want to create an indefinite global ban on AI absent coordination problems, because they think benefits exceed costs, even those who put 10%+ on catastrophic outcomes, like Elon Musk or Dario Amodei (e.g. I put 10%+ on disastrous outcomes from AI development but wouldn’t want a permanent ban; even Eliezer Yudkowsky doesn’t want a permanent ban on AI).
I do think most of the AI company leadership that actually believes they may succeed in creating AGI or ASI would want to be able to take a year or two for safety testing and engineering if they were approaching powerful AGI and ASI absent issues of commercial and geopolitical competition (and I would want that too). And I think future US and Chinese governments, faced with powerful AGI/ASI and evidence of AI misbehavior and misalignment, would want to do so save for geopolitical rivalry.
Faced with that competition, an actor taking a given level of risk to the world doesn’t internalize that full risk, only the increment of risk over what its competitor would take. And companies and states both get a big difference in value from being incrementally ahead vs behind competitors. For companies it can be huge profits vs bankruptcy. For states (and the several AI CEOs who actually believe in AGI and worry about how others would develop and use it; I agree most of the hyperscaler CEOs and the like are looking at things from a pure business perspective and don’t even believe in AGI/ASI) there is the issue of power (among other non-financial motives) as a reason to care about being first.