Focusing on empirical results: "Learning to summarize from human feedback" was good, for several reasons. I liked the recent paper empirically demonstrating objective robustness failures hypothesized in earlier theoretical work on inner alignment.
Side note: Bostrom does not hold or argue for 100% weight on total utilitarianism, such that he would take overwhelming losses on other views for tiny gains on total utilitarian stances. In Superintelligence he specifically rejects an example of an extreme tradeoff of that magnitude (not reserving one galaxy's worth of resources out of millions for humanity/existing beings, even if posthumans would derive more wellbeing from a given unit of resources). I also wouldn't actually accept a 10 million year delay in tech progress (and the death of all existing beings who would otherwise have enjoyed extended lives from advanced tech, etc.) for a 0.001% reduction in existential risk.
By that token, most particular scientific experiments or contributions to political efforts may be similar: e.g. if there is a referendum to pass a pro-innovation regulatory reform and science funding package, a given donation or staffer in support of it is very unlikely to counterfactually tip it into passing, although the expected value and average returns could be high, and the collective effort has a large chance of success.
Your 3 items cover good + top priority, good + not top priority, and bad + top priority, but not #4, bad + not top priority.

I think people concerned with x-risk generally think that progress studies, as a program of intervention to expedite growth, will have less expected impact (good or bad) on the history of the world per unit of effort; and if we condition on people thinking progress studies does more harm than good, then mostly they'll say it's not important enough to focus on arguing against at the current margin (as opposed to directly targeting urgent threats to the world). Only a small portion of generalized economic expansion will go to the most harmful activities (and damage there comes from expediting dangerous technologies in AI and bioweapons that we are improving in our ability to handle, so that delay would help) or to efforts to avert disaster, so there is much more leverage in focusing narrowly on the most important areas.

With respect to synthetic biology in particular, I think there is a good case for delay: right now the capacity to kill most of the world's population with bioweapons is not available in known technologies (although huge secret bioweapons programs like the old Soviet one may have developed dangerous things already), and if that capacity is delayed there is a chance it will be averted or become much easier to defend against via AI, universal sequencing, and improvements in defenses and law enforcement. This is even more so for those sub-areas that most expand bioweapon risk. That said, any attempt to discourage dangerous bioweapon-enabling research must compete against other interventions (improved lab safety, treaty support, law enforcement, countermeasure platforms, etc.), and so would have to itself be narrowly targeted and leveraged.

With respect to artificial intelligence, views on sign vary depending on whether one thinks the risk of an AI transition is getting better or worse over time (better because of developments in areas like AI alignment and transparency research, field-building, etc.; or worse because of societal or geopolitical changes). Generally, though, people concerned with AI risk think it much more effective to fund efforts to find alignment solutions and improved policy responses (growing them from a very small base, so cost-effectiveness is relatively high) than a diffuse and ineffective effort to slow the technology (especially in a competitive world where the technology would be developed elsewhere, perhaps with higher transition risk).

For most other areas of technology and economic activity (e.g. energy, agriculture, most areas of medicine), x-risk/longtermist implications are comparatively small, suggesting a more neartermist evaluative lens (e.g. comparing more against things like GiveWell).

Long-lasting (centuries) stagnation is a risk worth taking seriously (and the slowdown of population growth that sustained superexponential growth through history until recently points to stagnation absent something like AI to ease the labor bottleneck), but it seems a lot less likely than other x-risks. If you think AGI is likely this century, then we will return to the superexponential track (but more explosively) and approach technological limits to exponential growth, followed by polynomial expansion in space.
Absent AGI or catastrophic risk (although stagnation with advanced WMD would increase such risk), permanent stagnation also looks unlikely based on the capacities of current technology, given time for population to grow and reach frontier productivity.

I think the best case for progress studies being top priority would be a strong focus on the current generation compared to all future generations combined, on rich-country citizens vs the global poor, and on technological progress over the next few decades rather than in 2121. But given my estimates of catastrophic risk and my sense of the interventions, at the current margin I'd still think that reducing AI and biorisk does better for current people than the progress studies agenda per unit of effort. I wouldn't support arbitrary huge sacrifices of the current generation to reduce tiny increments of x-risk, but at the current level of neglectedness and impact (for both current and future generations) averting AI and bio catastrophe looks more impactful without extreme valuations. As such risk reduction efforts scale up, marginal returns would fall and growth-boosting interventions would become more competitive (with a big penalty for those couple of areas that disproportionately pose x-risk).

That said, understanding tech progress, returns to R&D, and similar issues also comes up in trying to model and influence the world in assorted ways (e.g. it's important in understanding AI risk, or in building technological countermeasures to risks to long-term development). I have done a fair amount of investigation that would fit into progress studies as an intellectual enterprise for such purposes. I also lend my assistance to some neartermist EA research focused on growth, in areas that don't very disproportionately increase x-risk, and to development of technologies that make it more likely things will go better.
Robin Hanson argues in Age of Em that annualized growth rates will reach over 400,000% as a result of automation of human labor with full substitutes (e.g. through brain emulations)! He's a weird citation for thinking the same technology can't manage 20% growth.

"I really don't have strong arguments here. I guess partly from experience working on an automated trading system (i.e. actually trying to automate something)"

This and the usual economist arguments against fast AGI growth seem to be more about denying the premise of ever succeeding at AGI/automating human-substitute minds (extrapolating from a world where we have not yet built human substitutes to conclude they won't be produced in the future), rather than addressing the growth that could then be enabled by the resulting AI.
I find that 57% very difficult to believe; 10% would be a stretch. Having intelligent labor that can be quickly produced in factories (by companies that have been able to increase output by millions of times over decades), and that can do tasks including improving the efficiency of robots (already cheap relative to humans where we have the AI to direct them, and that before reaping economies of scale by producing billions) and solar panels (which already have energy payback times on the order of 1 year in sunny areas), along with still-abundant untapped energy resources orders of magnitude greater than what our current civilization taps on Earth (and a billionfold more for the Solar System), makes it very difficult to make the 'AGI but no TAI' world coherent.

Cyanobacteria can double in 6-12 hours under good conditions, and mice can grow their population more than 10,000x in a year. So machinery can be made to replicate quickly, and trillions of von Neumann-equivalent researcher-years (but with AI advantages) can move us further towards that from existing technology. I predict that cashing out the given reasons into detailed descriptions will result in inconsistencies or very implausible requirements.
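As a rough illustration of what replication rates imply for growth, here is a minimal sketch converting doubling times into annualized growth multiples; the specific doubling times (other than the biological examples above) are my own illustrative assumptions, not claims about actual robot industry:

```python
import math

# Rough conversion of replication doubling times into annualized growth multiples.
# Only the biological figures come from the comment above; the rest are illustrative.

def annualized_multiple(doubling_time_days: float) -> float:
    """Growth multiple over one year implied by a constant doubling time."""
    doublings_per_year = 365.0 / doubling_time_days
    return 2.0 ** doublings_per_year

examples = {
    "cyanobacteria (~12 hour doubling)": 0.5,
    "mice (~10,000x/year, i.e. ~27 day doubling)": 365.0 / math.log2(10_000),
    "hypothetical robot industry (1 year doubling)": 365.0,
}

for label, days in examples.items():
    print(f"{label}: ~{annualized_multiple(days):.3g}x per year")
```

Even the slowest of these hypothetical doubling times is far faster than 20% annual growth, which is the point of the biological comparison.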
She does talk about century-plus timelines here and there.
I suspect there are biases in the EA conversation whereby arguments compatible with hedonistic utilitarianism get discussed more than considerations that would upset hedonistic utilitarians, and intuitions coming from other areas may then effectively subsidize both the demand for and the supply of such arguments.
“I would guess most arguments for global health and poverty over animal welfare fall under the following:
- animals are not conscious or less conscious than humans
- animals suffer less than humans"
I'm pretty skeptical that these arguments descriptively account for most of the people explicitly choosing global poverty interventions over animal welfare interventions, although they certainly account for some people. Polls show wide agreement that birds and mammals are conscious and have welfare to at least some degree. And on most models on which degree of consciousness (in at least some senses) varies greatly, it doesn't vary so greatly that one would say, e.g., that it's more expensive to improve consciousness-adjusted welfare in chickens than in humans today. And I say that as someone who thinks it pretty plausible that there are important orders-of-magnitude differences in quantitative aspects of consciousness.

I'd say descriptively the bigger thing is people just feeling more emotional/moral obligations to humans than to other animals, not thinking human welfare varies a millionfold more, in the same way that people who choose to 'donate locally' in rich communities where the cost to save a life is hundreds of times greater than abroad don't think that poor foreigners are a thousand times less conscious, even as they trade off charitable options as though weighting locals hundreds of times more than foreigners.

An explicit philosophical articulation of this is found in Shelly Kagan's book on weighing the interests of different animals. While even on Kagan's view factory farming is very bad, he describes a view that assigns greater importance to interests of a given strength for beings with more of certain psychological properties (or counterfactual potential for those properties). The philosopher Mary Anne Warren articulates something similar in her book on moral status, which assigns increasing moral status on the basis of a number of grounds including life (possessed by plants and bacteria, and calling for some status), consciousness, capacity to engage in reciprocal social relations, actual relationships, moral understanding, readiness to forbear in mutual cooperation, various powers, etc.

I predict that if you polled philosophers on cases involving helping different numbers of various animals, those sorts of accounts would be more frequent explanations of the results than doubt about animal consciousness (as a binary or on a quantitative scale).

This would be pretty susceptible to polling, e.g. you could ask the EA Survey team to try some questions on it (maybe for a random subset).
Hi Milan,

So far it has been used to back the donor lottery (this has no net $ outlay in expectation, but requires funds to fill out each block and handle million-dollar swings up and down), make a grant to ALLFED, fund Rethink Priorities' work on nuclear war, and provide small seed funds for some researchers investigating two implausible but consequential-if-true interventions (including the claim that creatine supplements boost cognitive performance for vegetarians).

Mostly it remains invested. In practice I have mostly been able to recommend major grants to other funders, so this fund is used when no other route is more appealing. Grants have often involved special circumstances or restricted funding, and the grants it has made should not be taken as recommendations to other donors to donate to the same things at the current margin in their circumstances.
There is some effect in this direction, but not a sudden cliff. There is plenty of room to generalize. We create models of alternative coherent lawlike realities, e.g. the Game of Life, or physicists modeling different physical laws.
Thanks David, this looks like a handy paper!
Given all of this, we’d love feedback and discussion, either as comments here, or as emails, etc.
I don't agree with the argument that infinite impacts of our choices are of Pascalian improbability; in fact, I think we probably face them as a consequence of one-boxing decision theory, and some of the more plausible routes to local infinite impact are missing from the paper:
The decision theory section misses the simplest argument for infinite value: in an infinite inflationary universe with infinite copies of me, my choices are multiplied infinitely. If I would one-box on Newcomb's Problem, then I would take the difference between eating the sandwich and not eating it to be scaled out infinitely. I think this argument is in fact correct and follows from our current cosmological models combined with one-boxing decision theories.
Under ‘rejecting physics’ I didn’t see any mention of baby universes, e.g. Lee Smolin’s cosmological natural selection. If that picture were right, or anything else in which we can affect the occurrence of new universes/inflationary bubbles forming, then that would permit infinite impacts.
The simulation hypothesis is a plausible way for our physics models to be quite wrong about the world in which the simulation is conducted, and further there would be reason to think simulations would be disproportionately conducted under physical laws that are especially conducive to abundant computation.
Here are two posts from Wei Dai, discussing the case for some things in this vicinity (renormalizing in light of the opportunities):

https://www.lesswrong.com/posts/Ea8pt2dsrS6D4P54F/shut-up-and-divide

https://www.lesswrong.com/posts/BNbxueXEcm6dCkDuk/is-the-potential-astronomical-waste-in-our-universe-too
Thanks for this detailed post on an underdiscussed topic! I agree with the broad conclusion that extinction via partial population collapse and infrastructure loss, rather than via a catastrophe potent enough to leave no or almost no survivors (or indirectly enabling some later extinction-level event), has very low probability. Some comments:
Regarding case 1, with a pandemic leaving 50% of the population dead but no major infrastructure damage, I think you can make much stronger claims about there not being 'civilization collapse', meaning near-total failure of industrial food, water, and power systems. Indeed, collapse so defined from that stimulus seems nonsensical to me, for rich quantitative reasons.
There is no WMD war in this scenario; otherwise there would be major infrastructure damage.
If half of people are dead, that cuts the need for food and water by half (doubling per capita stockpiles), while already planted calorie-rich crops can easily be harvested with a half-size workforce.
Today agriculture makes up closer to 5% than 10% of the world economy, and most of that effort is expended on luxuries such as animal agriculture, expensive fruits, avoidable food waste, and other things that aren't efficient ways to produce nutrition. Adding all energy (again, most of which is not needed for basic survival as opposed to luxuries) brings the total to ~15%, with perhaps 5% going to necessities (and ~2.5% of the workforce sufficing for half production for half the population). That leaves a vast surplus workforce (see the rough sketch after this list).
The catastrophe doubles resources of easily accessible fossil fuels and high quality agricultural land per surviving person, so just continuing to run the best 50% of farmland and the best 50% of oil wells means an increase in food and fossil fuels per person.
Likewise, there is a surplus of agricultural equipment, power plants, water treatment plants, and operating the better half of them with the surviving half of the population could improve per capita availability. These plants are parallel and independent enough that running half of them would not collapse productivity, which we can confirm by looking back to when there were half as many, etc.
Average hours worked per capita is already at historical lows, leaving plenty of room for trained survivors to work longer shifts while people switch over from other fields and retrain.
Historical plagues such as the Black Death or smallpox in the Americas did not cause a breakdown of food production per capita for the survivors.
Historical wartime production changes show enormous and adequate flexibility in production.
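To spell out the surplus-labor point numerically, here is a minimal sketch; the shares are assumed round numbers in the spirit of the figures above (~5% of labor on basic necessities), not data:

```python
# Minimal arithmetic for the "50% dead, no infrastructure damage" scenario.
# Shares are assumed round numbers, not measured figures.

share_of_labor_on_necessities = 0.05   # assumed: food + essential energy/water today
surviving_workforce_fraction = 0.50    # half the population survives

# Half the population needs roughly half the necessities output,
# so roughly half of today's necessities labor suffices.
labor_needed = share_of_labor_on_necessities * 0.5

fraction_of_survivors_needed = labor_needed / surviving_workforce_fraction
print(f"Share of surviving workforce needed for necessities: {fraction_of_survivors_needed:.0%}")
# -> 5%, leaving ~95% of surviving labor free for everything else
```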
Re the likelihood of survival without industrial agriculture systems, the benchmark should be something closer to preindustrial European agriculture, not hunter-gatherers. You discuss this but it would be helpful to put more specific credences on those alternatives.
The productivity of organic agriculture is still enormously high relative to hunting and gathering.
Basic knowledge of crop rotation, access to improved and globally spread crop varieties such as potatoes, ploughs, etc., permitted very high population density before industrial agriculture, with very localized supply chains. One can see this in colonial agricultural communities, which could be largely self-sustaining (mines for metal tools being one of the worst supply constraints, but fine in a world where so much metal has already been mined and is just sitting around for reuse).
By the same token, talking about ‘at least 10%’ of 1-2 billion subsistence farmers continuing agriculture is a very low figure. I assume it is a fairly extreme lower bound, but it would be helpful to put credences on lower bounds and to help distinguish them from more likely possibilities.
Re food stockpiles:
“I’m ignoring animal agriculture and cannibalism, in part because without a functioning agriculture system, it’s not clear to me whether enough people would be able to consume living beings.”
Existing herds of farmed animals would likely be killed and eaten/preserved.
If transport networks are crippled, then this could be for local consumption, but that would increase food inequality and the likelihood of survival in dire situations.
There are about 1 billion cattle alone, with several hundred kg of edible mass each, plus about a billion sheep, ~700 million pigs, and 450 million goats.
In combination these could account for hundreds of billions of human-days of nutritional requirements (I think they make up a large share of 'global food stocks' in your table of supplies); a rough back-of-envelope calculation follows after this list.
Already planted crops ready to harvest constitute a huge stockpile for the scenarios without infrastructure damage.
Particularly for severe population declines, fishing is limited by fish supplies, and existing fishing boats capture and kill vast quantities of fishes in days when short fishing seasons open. If the oceans are not damaged, this provides immense food resources to any survivors with modern fishing knowledge and some surviving fishing equipment.
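Here is that back-of-envelope check on the "hundreds of billions of human-days" figure; the head counts follow the numbers above, while the edible-mass and calorie figures are my own rough assumptions:

```python
# Back-of-envelope check on livestock as a food stockpile.
# Head counts follow the comment above; edible mass and calorie densities are rough assumptions.

livestock = {
    # species: (head count, assumed edible kg per animal)
    "cattle": (1.0e9, 200),
    "sheep":  (1.0e9, 20),
    "pigs":   (7.0e8, 55),
    "goats":  (4.5e8, 15),
}

KCAL_PER_KG = 2500          # assumed average caloric density of edible tissue
KCAL_PER_PERSON_DAY = 2000  # rough subsistence requirement

total_kcal = sum(count * kg * KCAL_PER_KG for count, kg in livestock.values())
person_days = total_kcal / KCAL_PER_PERSON_DAY
print(f"~{person_days:.1e} person-days of calories")
# -> on the order of 10^11, i.e. hundreds of billions of person-days
```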
“So what, concretely, do I think would happen in the event of a catastrophe like a “moderate” pandemic — one that killed 50% of people, but didn’t cause infrastructure damage or climate change? My best guess is that civilization wouldn’t actually collapse everywhere. But if it did, I expect that the ~4 billion survivors would shrink to a group of 10–100 million survivors during a period of violent competition for surviving goods in grocery stores/distribution centers, food stocks, and fresh water sources.”
For the reasons discussed above I strongly disagree with the claim after “I expect.”
“All this in mind, I think it is very likely that the survivors would be able to learn enough during the grace period to be able to feed and shelter themselves ~indefinitely.”
I would say the probability should be higher here.
Regarding radioactive fallout, an additional factor not discussed is the decline of fallout danger over time: lethal areas are quite different over the first week vs the first year, etc.
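To illustrate that time-decay point, here is a minimal sketch using the standard t^-1.2 rule of thumb for fallout dose-rate decay (the civil-defense "7:10 rule"); the values are approximate orders of magnitude only:

```python
# Fallout dose-rate decay under the t^-1.2 approximation (rule of thumb, not a precise model).

def relative_dose_rate(hours_after_detonation: float) -> float:
    """Dose rate relative to the 1-hour rate, per the t^-1.2 approximation."""
    return hours_after_detonation ** -1.2

for label, hours in [("1 hour", 1), ("1 day", 24), ("1 week", 24 * 7),
                     ("1 month", 24 * 30), ("1 year", 24 * 365)]:
    print(f"{label:>7}: ~{relative_dose_rate(hours):.1e} of the 1-hour dose rate")
# Dose rates fall roughly 500-fold within a week and tens of thousands of times within a year,
# so areas lethal in the first days are far less dangerous later on.
```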
Re Scenario 2: "Given all of this, my subjective judgment is that it's very unlikely that this scenario would more or less directly lead to human extinction." I would again say this is even less likely.
In general I think extinction probability from WMD war is going to be concentrated in the plausible future case of greatly increased/deadlier arsenals: millions of nuclear weapons rather than thousands, enormous and varied bioweapons arsenals, and billions of anti-population hunter-killer robotic drones slaughtering survivors including those in bunkers, all released in the same conflict.
“Given this, I think it’s fairly likely, though far from guaranteed, that a catastrophe that caused 99.99% population loss, infrastructure damage, and climate change (e.g. a megacatastrohe, like a global war where biological weapons and nuclear weapons were used) would more or less directly cause human extinction.”
This seems like a sign error, differing from your earlier and later conclusions?
“I think it’s fairly unlikely that humanity would go extinct as a direct result of a catastrophe that caused the deaths of 99.99% of people (leaving 800 thousand survivors), extensive infrastructure damage, and temporary climate change (e.g. a more severe nuclear winter/asteroid impact, plus the use of biological weapons).”
It sounds like you're assuming a common scale between the theories (maximizing expected choice-worthiness).
A common scale isn't necessary for my conclusion (I think you're substituting it for a stronger claim?) and I didn't invoke it. As I wrote in my comment, on negative utilitarianism, s-risks that are many orders of magnitude smaller than worse ones, without correspondingly huge differences in probability, get ignored in favor of the latter. On variance normalization, or bargaining solutions, or a variety of methods that don't amount to dictatorship of one theory, the weight for an NU view is not going to spend its decision-influence on the former rather than the latter when they're both non-vanishing possibilities.
I would think something more like your hellish example + billions of times more happy people would be more illustrative. Some EAs working on s-risks do hold lexical views.
Sure (which will make the s-risk definition even more inapt for those people), and those scenarios will be approximately ignored vs scenarios that are more like 1/100 or 1/1000 being tortured on a lexical view, so there will still be the same problem of s-risk not tracking what's action-guiding or a big deal in the history of suffering.
"Just a clarification: s-risks (risks of astronomical suffering) are existential risks."
This is not true by the definitions given in the original works that defined these terms. Existential risk is defined to only refer to things that are drastic relative to the potential of Earth-originating intelligent life:
"where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential."
Any x-risks are going to be in the same ballpark of importance if they occur, and immensely important to the history of Earth-originating life; any x-risk is a big deal relative to that future potential.

S-risk is defined as just any case where there's vastly more total suffering than in Earth's history heretofore, not one where suffering is substantial relative to the downside potential of the future.
S-risks are events that would bring about suffering on an astronomical scale, vastly exceeding all suffering that has existed on Earth so far.
In an intergalactic civilization making heavy use of most stars, that definition would be met by situations where things are largely utopian but 1 in 100 billion people per year get a headache, or by a hell where everyone was tortured all the time. These are both defined as s-risks, but the bad elements in the former are microscopic compared to the latter, or compared to the expected value of suffering. With even a tiny weight on views valuing the good parts of future civilization, the former could be an extremely good world, while the latter would be a disaster by any reasonable mixture of views. Even with a fanatical restriction to only consider suffering and not any other moral concerns, the badness of the former should be almost completely ignored relative to the latter if there is non-negligible credence assigned to both. So while x-risks are all critical for civilization's upside potential if they occur, almost all s-risks will be incredibly small relative to the potential for suffering, and something being an s-risk doesn't mean its occurrence would be an important part of the history of suffering, if both have non-vanishing credence.

From the s-risk paper:
We should differentiate between existential risks (i.e., risks of "mere" extinction or failed potential) and risks of astronomical suffering ("suffering risks" or "s-risks"). S-risks are events that would bring about suffering on an astronomical scale, vastly exceeding all suffering that has existed on Earth so far.
The above distinctions are all the more important because the term "existential risk" has often been used interchangeably with "risks of extinction", omitting any reference to the future's quality. Finally, some futures may contain both vast amounts of happiness and vast amounts of suffering, which constitutes an s-risk but not necessarily a (severe) x-risk. For instance, an event that would create 10^25 unhappy beings in a future that already contains 10^35 happy individuals constitutes an s-risk, but not an x-risk.
If one were to make an analog to the definition of s-risk for loss of civilization's potential, it would be something like risks of loss of potential welfare or goods much larger than seen on Earth so far. So it would be a risk of this type to delay interstellar colonization by a few minutes and colonize one less star system. But such 'nano-x-risks' would have almost none of the claim to importance and attention that comes with the original definition of x-risk. Going from 10^20 star systems to 10^20 star systems less one should not be put in the same bucket as premature extinction, or going from 10^20 to 10^9. So long as one does not have a completely fanatical view and gives some weight to different perspectives, longtermist views concerned with realizing civilization's potential should give way on such minor proportional differences to satisfy other moral concerns, even though the absolute scales are larger.

Bostrom's Astronomical Waste paper specifically discusses such things, but argues that since their impact would be so small relative to existential risk they should not be a priority (at least in utilitarianish terms) relative to the latter.

This disanalogy between the x-risk and s-risk definitions is a source of ongoing frustration to me, as s-risk discourse thus often conflates hellish futures (which are existential risks, and especially bad ones), or possibilities of suffering on a scale significant relative to the potential for suffering (or what we might expect), with bad events many orders of magnitude smaller, or futures that are utopian by common-sense standards and compared to our world or the downside potential.

I wish people interested in s-risks that are actually near worst-case scenarios, or that are large relative to the background potential or expectation for downside, would use a different word or definition, one that would make it possible to say things like 'people broadly agree that a future constituting an s-risk is a bad one, and not a utopia' or at least 'the occurrence of an s-risk is of the highest importance for the history of suffering.'
The $1B commitment attributed to Musk early on is different from the later Microsoft investment. The former went away despite the media hoopla.
It's invested in unleveraged index funds, but was out of the market for the pandemic crash and bought in at the bottom. Because it's held with Vanguard as a charity account, it's not easy to invest as aggressively as I do my personal funds for donation (in light of the lower risk-aversion appropriate for altruistic investors relative to those investing for personal consumption), although I am exploring options in that area.

The fund has been used to finance the CEA donor lottery, and to make grants to ALLFED and Rethink Charity (for nuclear war research). However, it should be noted that I only recommend grants for the fund that I think aren't a better fit for other funding sources I can make recommendations to, often with special circumstances or restricted funding, and grants it has made should not be taken as recommendations from me to other donors to donate to the same things at the margin. [This applies to the object-level grants; using donor lotteries is generally sensible for a wide variety of donation views.]
Longtermists sometimes argue that some causes matter extraordinarily more than others—not just thousands of times more, but 10^30 or 10^40 times more.
I don't think any major EA or longtermist institution believes this about expected impact for 10^30 differences. There are too many spillovers for that: e.g. if doubling the world economy of $100 trillion/yr would modestly shift x-risk or the fate of wild animals, then interventions that affect economic activity have to have an expected absolute value of impact much greater than 10^-30 times that of the most impactful interventions in expectation.
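To make the spillover point concrete, here is a minimal sketch; every number in it is an assumption chosen for illustration, not a figure from the discussion above:

```python
# Illustrative spillover arithmetic (all numbers assumed).
# Point: indirect effects alone keep relative impact far above 10^-30.

WORLD_GDP = 1e14                             # ~$100 trillion/yr
XRISK_SHIFT_PER_DOUBLING = 1e-4              # assumed: doubling the economy shifts x-risk by 0.01%
TOP_XRISK_REDUCTION_PER_DOLLAR = 1e-4 / 1e9  # assumed: a top intervention buys 0.01% x-risk reduction per $1B

# Expected x-risk effect of a marginal $1 of generic economic activity,
# treated as a tiny fraction of a full doubling:
spillover_per_dollar = XRISK_SHIFT_PER_DOUBLING / WORLD_GDP

ratio = spillover_per_dollar / TOP_XRISK_REDUCTION_PER_DOLLAR
print(f"Generic $1 vs top intervention $1: ~{ratio:.0e}")
# -> ~1e-05 under these assumptions: many orders of magnitude short of 1e-30
```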
This argument requires that causes differ astronomically in relative cost-effectiveness. If cause A is astronomically better than cause B in absolute terms, but cause B is 50% as good in relative terms, then it makes sense for me to take a job in cause B if I can be at least twice as productive.
I suspect that causes don’t differ astronomically in cost-effectiveness. Therefore, people should pay attention to personal fit when choosing an altruistic career, and not just the importance of the cause.
The premises and conclusion don't seem to match here. A difference of 10^30x is crazy, but rejecting that doesn't mean you don't have huge practical differences in impact like 100x or 1000x. Those would be plenty to come close to maxing out the possible effect of differences between causes (since if you're 1000x as good at rich-country homelessness relief as at preventing pandemics, then if nothing else your fame for rich-country poverty relief would be a powerful resource to help out in other areas, like public endorsements of good anti-pandemic efforts).

The argument seems sort of like: "Some people say if you go into careers like quant trading you'll make 10^30 dollars and can spend over a million dollars to help each animal with a nervous system. But actually you can't make that much money even as a quant trader, so people should pay attention to fit with different careers when trying to make money, since you can make more money in a field with half the compensation per unit of productivity if you are twice as productive there." The range for realistic large differences in compensation between fields (e.g. fast-food cashier vs quant trading) is missing from the discussion.

You define astronomical differences at the start as 'not just thousands of times more', but the range up to thousands of times more is where all the action is.