I am a research analyst at the Center on Long-Term Risk.
I’ve worked on grabby aliens, the optimal spending schedule for AI risk funders, and evidential cooperation in large worlds.
Some links
I recently wrote about how AGI timelines change the relative value of ‘slow’ acting neartermist interventions relative to ‘fast’ acting neartermist interventions.
It seems to me that EAs in other cause areas mostly ignore this, though I haven’t looked into this too hard.
My (very rough) understanding of Open Philanthropy’s worldview diversification approach is that the Global Health and Wellbeing focus area team operates with both (potentially) different values and a different epistemic approach from the Longtermism focus area team. The epistemic approach of the former seems more reliant on “common sense” ways of doing good.
Thanks for running the survey, I’m looking forward to seeing results!
I’ve filled out the form but find some of the potential arguments problematic. It could be worth seeing how persuasive others find these arguments, but I would be hesitant to promote arguments that don’t seem robust. In general, I think more disjunctive arguments work well.
For example, (being somewhat nitpicky):
Everyone you know and love would suffer and die tragically.
Some existential catastrophes could happen painlessly and quickly.
We would destroy the universe’s only chance at knowing itself...
Aliens (maybe!) or (much less likely imo) another intelligent species evolving on Earth
There are co-benefits to existential risk mitigation: prioritizing these risks means building better healthcare infrastructure, better defense against climate change, etc.
It seems that work on biorisk prevention does involve “building better healthcare infrastructure”, but it is maybe misleading to characterise it this way, since I imagine people think of something different when they hear the term. There are also drawbacks to some (proposed) existential risk mitigation interventions.
I definitely agree people should be thinking about this! I wrote about something similar last week :-)
Is there a better word than ‘sustenance’ for outcomes where humanity does not suffer a global catastrophe?
There is some discussion here about such a term
Surely most neartermist funders think that the probability that we get transformative AGI this century is low enough that it doesn’t have a big impact on calculations like the ones you describe?
I agree with Thomas Kwa on this
There are a couple views by which neartermism is still worthwhile even if there’s a large chance (like 50%) that we get AGI soon -- …
I think neartermist causes are worthwhile in their own right, but think some interventions are less exciting when (in my mind) most of the benefits are on track to come after AGI.
The idea that a neartermist funder becomes convinced that world-transformative AGI is right around the corner, and then takes action by dumping all their money into fast-acting welfare enhancements, instead of trying to prepare for or influence the immense changes that will shortly occur, almost seems like parody
Fair enough. My prediction is that the idea will become more palatable over time as we get closer to AGI in the next few years. Even if there is only a small chance we get the opportunity to do this, I think it could be worth thinking about further, given the amount of money earmarked for spending on neartermist causes.
Thanks for writing the post :-)
I think I’m a little confused: going by the title, I expected the post to say something like “even if you think AI risk by year Y is X% or greater, you maybe shouldn’t change your life plans too much”, but instead you’re saying “AI risk might be lower than you think, and at a low level it doesn’t affect your plans much”, and then giving some good considerations for why AI x-risk might be lower.
You can react to images without text. But you need to tap on the side of the image, since tapping on the image itself maximizes it
Thanks, this is useful to know!
2x speed on voice messages
Just tested, and Signal in fact has this feature.
I’d also add in Telegram’s favour
Web based client (https://web.telegram.org/) whereas Signal requires an installed app for some (frustrating) reason
and in Signal’s favour
Any emoji reaction is available (in Telegram you have to pay for extra reacts) [this point leads me to worry Telegram will become more out-to-get-me over time]
Less weird behaviour (e.g. in Telegram, I can’t react to images that are sent without text & in some old group chats I can’t react to anything)
(I am neither a fan of Signal nor Telegram, but wanted to add to the list. I haven’t seen Element discussed at all and weakly prefer it over both Signal and Telegram)
This LessWrong post had some good discussion about some of the same ideas :-)
An advanced civilization from outer space could easily colonize our planet and enslave us as Columbus enslaved the Indigenous tribes of the Americas
I think this is unlikely, since my guess is that (if civilization continues on Earth) we’ll reach technological maturity in a much shorter time than we expect to meet aliens (I consider the time until we meet aliens here).
Thanks for putting this together!
The list of people on the Google form and the list in this post don’t match (e.g. Seren Kell is on the post but not on the form, and vice versa for David Manheim and Zachary Robinson)
I’d add another benefit that I’ve not seen in the other answers: deciding on the curriculum and facilitating yourself get you to engage (critically) with a lot of EA material. Especially for the former, you have to think about the EA idea-space and work out a path through it all for fellows.
I helped create a fellowship curriculum (mostly a hybrid of two existing curricula iirc) before there were virtual programs, and this definitely got me more involved with EA. Of course, there may be a trade-off in quality.
I agree with what you say, though would note
(1) maybe doom should be disambiguated between “the short-lived simulation that I am in is turned off”-doom (which I can’t really observe) and “the basement reality Earth I am in is turned into paperclips by an unaligned AGI”-type doom.
(2) conditioning on me being in at least one short-lived simulation, if the multiverse is sufficiently large and the simulation containing me is sufficiently ‘lawful’, then I may expect there to be basement reality copies of me too. In this case, doom is implied for (what I would guess is) most exact copies of me.
Thanks for this post! I’ve been meaning to write something similar, and am glad you have :-)
I agree with your claim that most observers like us (who believe they are at the hinge of history) are in (short-lived) simulations. Brian Tomasik discusses how this consideration marginally increases the value of interventions with short-term effects.
In particular, if you think the simulations won’t include other moral patients simulated to a high resolution (e.g. Tomasik suggests this may be the case for wild animals in remote places), you would instrumentally care less about their welfare (since acting to increase their welfare may only have effects in basement reality and in the more expensive simulations that do simulate such wild animals). At the extreme is your suggestion, where you are the only person in the simulation and so you may act as a hedonist! Given some uncertainty over the distribution of “resolution of simulations”, it seems likely that one should still act altruistically.
I disagree with the claim that if we do not pursue longtermism, then no simulations of observers like us will be created. For example, I think an Earth-originating unaligned AGI would still have instrumental reasons to run simulations of 21st century Earth. Further, alien civilizations may have an interest in learning about other civilizations.
Under your assumptions, I don’t think this is a Newcomb-like problem. I think CDT & EDT would agree on the decision,[1] which I think depends on the number of simulations and the degree to which the existence of a good longterm future hinges on your decisions. Supposing humanity only survives if you act as a longtermist and simulations of you are only created if humanity survives, then you can’t both act hedonistically and be in a simulation.
This tool is impressive, thanks! I like the framing you use of safety as a race against capabilities, though I don’t really know what it would look like to have “solved” AGI safety 20 years before AGI. I also appreciate all the assumptions being listed at the end of the page.
Some minor notes
the GitHub link in the webpage footer points to the wrong page
I think two of the prompts “How likely is it to work?” and “How much do you speed it up?” would be clearer if “it” were replaced by “AGI safety” (if that is what it is referring to).
Thanks for this post! I used to do some voluntary university community building, and some of your insights definitely ring true to me, particularly the Alice example—I’m worried that I might have been the sort of facilitator to not return to the assumptions in fellowships I’ve facilitated.
A small note:
Well, the most obvious place to look is the most recent Leader Forum, which gives the following talent gaps (in order):
This EA Leaders Forum was nearly 3 years ago, and so talent gaps have possibly changed. There was a Meta Coordination Forum last year run by CEA, but I haven’t seen any similar write-ups. This doesn’t seem to be an important crux for most of your points, but I thought it would be worth mentioning.
This definitely sounds like a better approach than mine, thanks for sharing! This will be useful for me for any future projects
Thanks for your questions and comments! I really appreciate someone reading through in such detail :-)
What is the highest probability of encountering aliens in the next 1000 years according to reasonable choices one could make in your model?
SIA (with no simulations) gives the nearest and most numerous aliens.
My bullish prior (which a priori has 80% credence in us not being alone), with SIA and the assumption that grabby aliens are hiding, gives a median of ~ chance of a grabby civilization reaching us in the next 1000 years.
I don’t condition on us not having any ICs in our past light cone. When conditioning on not being inside a GC, SIA is pretty confident (~80% certain) that we have at least one IC (origin planet) in our past light cone. When conditioning on not seeing any GCs, SIA thinks ~50% that there’s at least one IC in our past light cone. Even if their origin planet is in our past light cone, they may already be dead.
Sometimes you just give a prior, e.g., your prior on d, where I don’t really know where it comes from. If it wouldn’t take too much time, it might be worth it to quickly motivate them (e.g., “I think that any interval between x and y would be reasonable because of such and such, and I fitted a lognormal”). It’s possible I’m just missing something obvious to those familiar with the literature.
Thanks for the suggestion, this was definitely an oversight. I’ll add in some text to motivate each prior.
My prior for the sum of the delay and fuse steps: by definition it is bounded above by the time until now and bounded below by zero.
I set the median to ~0.5 Gy, both to account for the potential delay in the Earth first becoming habitable (since the range of estimates around the first life appearing is ~600 My) and to be roughly in line with estimates of the time plants took to oxygenate the atmosphere (a potential delay/fuse step).
My prior roughly fits these criteria.
My prior for this parameter is pretty arbitrarily chosen. Here’s a post-hoc (motivated) semi-justification for the prior. Wikipedia discusses ~8 possible factors for Rare Earths. If there are necessary Rare-Earth-like factors for life, each with some fraction of planets having the property, then my prior isn’t awfully off.
If one thinks that a fraction between 0.1 and 1 of all planets has each of the eight factors (and the factors are independent), something roughly similar to my prior distribution follows, as in the quick sketch below.
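For concreteness, here is a quick Monte Carlo sketch of this (my own illustration, assuming for simplicity that each factor is drawn uniformly from [0.1, 1]; this is not the model code itself):

```python
# Quick Monte Carlo sketch: product of eight independent Rare-Earth-style factors,
# each drawn uniformly from [0.1, 1], as a rough check on the shape of the implied prior.
import numpy as np

rng = np.random.default_rng(0)
factors = rng.uniform(0.1, 1.0, size=(100_000, 8))
product = factors.prod(axis=1)  # implied fraction of planets passing all eight factors

print(np.quantile(product, [0.1, 0.5, 0.9]))  # rough spread of the implied distribution
```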
My prior for u, the early universe habitability factor, was mostly chosen arbitrarily. My prior implies a median time of ~10 Gy for the universe to be 50% habitable (i.e. the earliest time when habitable planets are in fact habitable, due to the absence of gamma ray bursts). In hindsight, I’d probably choose a prior for u that implied a smaller median.
My prior for fGC, the fraction of ICs that become GCs:
It is bounded below by 0.01, mostly to improve the Monte Carlo reliability in cases where smaller values are greatly preferred
Has a median of ~0.5. A Twitter poll Robin Hanson ran gave a similar value [I can’t find the reference right now].
Lots of the priors aren’t super well founded. Fortunately, if you think my bounds on each parameter are reasonable, I get the same conclusions when taking a joint prior that is uniform on and log-uniform in all other parameters.
Do you think your conclusion (e.g., around likelihood of observing GCs) would change significantly if “non-terrestrial” planets were habitable?
Good question. In a hack-y and unsatisfactory way, my model does allow for this:
If the ratio of non-terrestrial (habitable) planets to terrestrial (habitable) planets is , one can replace the product of try-once steps with to account for the extra planets. (My prior on is bounded above by 1, but this could easily be changed.) This approach would also suppose that non-terrestrial planets had the same distribution of habitable lifetimes as terrestrial ones.
Having said that, I don’t think a better approach would change the results for the SIA and ADT updates. For SSA, the habitability of non-terrestrial planets makes civs like us more atypical (since we are on a terrestrial planet). If this atypicality applies equally in worlds with many GCs and worlds with very few GCs, then I doubt it would change the results. All the anthropic theories would update strongly against the habitability of non-terrestrial planets.
Typos:
Thanks!
Great to see this work!
Thanks!
Re the SIA Doomsday argument, I think that is self-undermining for reasons I’ve argued elsewhere.
I agree. When I model the existence of simulations like us, SIA does not imply doom (as seen in the marginalised posteriors in the appendix here).
Further, in the simulation case, SIA would prefer human civilization to be atypically likely to become a grabby civilization (this does not happen in my model, as I suppose all civs have the same chance of transitioning to become grabby).
Re the habitability of planets, I would not just model that as lifetimes, but would also consider variations in habitability/energy throughput at a given time
...
Smaller stars may have longer habitable windows but also smaller values for V and M. This sort of consideration limits the plausibility of red dwarf stars being dominant, and also allows for more smearing out of ICs over stars with different lifetimes as both positive and negative factors can get taken to the same power.
I’d definitely like to see this included in future models (I’m surprised Hanson didn’t write about this in his Loud aliens paper). My intuition is that this changes little for the conclusions of SIA or anthropic decision theory with total utilitarianism, and that this weakens the case for many aliens for SSA, since our atypicality (or earliness) is decreased if we expect habitable planets around longer lived stars to have smaller volumes and/or lower metabolisms.
I’d also add, per Snyder-Beattie, catastrophes as a factor affecting probability of the emergence of life and affecting times of IC emergence.
I hadn’t seen this before, thanks for sharing! I’ve skimmed through and found it interesting, though I’m suspicious that at times it uses SSA-type reasoning (with the reference class of observers on planets habitable for as long as Earth).
Thanks for your response Robin.
I stand by the claim that doing both (updating on the time remaining) and (considering our typicality among all civilizations) is an error in anthropic reasoning, but agree there are non-time-remaining reasons to expect n>3 (e.g. by looking at steps in the evolution of intelligent life and reasoning about their difficulties). I think my ignorance-based prior on n was naive for not considering this.
I will address the issue of the compatibility of high n and high Lmax by looking at the likelihood ratios of pairs of (n, Lmax).
I first show a toy model to demonstrate that the deadline effect is weak (but present) and then reproduce the likelihood ratios from my Bayesian model.
My toy model (code here)
I make the following simplifications in this toy likelihood ratio calculation
There are two types of habitable planets: those that are habitable for 5 Gy and those that are habitable for Lmax>5 Gy
Given a maximum habitable planet duration Lmax, I suppose there are approximately T(Lmax) ≈ 5⋅10^18⋅(Lmax in Gy)^0.5 planets in the observable universe of this habitability.
The universe first becomes habitable at 5 Gy after the Big Bang.
All ICs become GCs.
There are no delay steps.
There are no try-once steps.[1]
And the most important assumptions:
If there are at least 5000 GCs[2] 20 Gy after the Big Bang I suppose a toy ‘deadline effect’ is triggered and no ICs or GCs appear after 20 Gy
If there are at least 1000 GCs 10 Gy after the Big Bang I suppose Earth-like life is precluded (these cases have zero likelihood ratio).
Writing f(t) & F(t) for the PDF & CDF of the Gamma distribution with parameters n and h (a rough code sketch of the whole calculation follows the case breakdown below):
If T(Lmax)F(5)>1000 then there are too many GCs and humanity is precluded.
The likelihood ratio is 0.
If the first case is false and T(Lmax)F(15)+T(5)F(5)>5000 the toy deadline effect is triggered and I suppose no more GCs arrive.
The likelihood ratio is directly proportional to f(4)/(T(Lmax)F(15)+T(5)F(5))
Otherwise, there is no deadline effect and life is able to appear late into the universe.
The likelihood ratio is directly proportional, with the same constant of proportionality as above, to f(4)/(T(Lmax)F(Lmax)+T(5)F(5))
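Putting the cases together, here is a rough sketch of the calculation (my reimplementation from the description above, not the linked code, so details may differ; I assume the Gamma distribution has shape n and scale h in Gy, and I drop the shared constant of proportionality):

```python
# Toy likelihood-ratio sketch, following the case breakdown above.
# Assumption: f(t), F(t) are the PDF/CDF of a Gamma distribution with shape n and scale h (in Gy).
from scipy.stats import gamma

def T(L):
    """Approximate number of planets in the observable universe habitable for L Gy."""
    return 5e18 * L**0.5

def toy_likelihood_ratio(n, h, L_max):
    """Likelihood ratio (up to a shared constant) for the pair (n, L_max)."""
    F = lambda t: gamma.cdf(t, a=n, scale=h)
    f = lambda t: gamma.pdf(t, a=n, scale=h)

    # Case 1: too many GCs by 10 Gy after the Big Bang -- Earth-like life is precluded
    if T(L_max) * F(5) > 1000:
        return 0.0
    # Case 2: more than 5000 GCs by 20 Gy -- the toy deadline effect is triggered
    gcs_by_20 = T(L_max) * F(15) + T(5) * F(5)
    if gcs_by_20 > 5000:
        return f(4) / gcs_by_20
    # Case 3: no deadline effect -- GCs keep appearing over each planet's habitable window
    return f(4) / (T(L_max) * F(L_max) + T(5) * F(5))

# The two cases compared in the results discussion below:
print(toy_likelihood_ratio(n=7, h=1000, L_max=1000))  # deadline effect triggered
print(toy_likelihood_ratio(n=7, h=1000, L_max=10))    # deadline effect not triggered
```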
Toy model results
Above: plots of the likelihood ratios for pairs (n, Lmax). The three plots vary by the value of h (the geometric mean of the hardness of the steps).
The white areas show where life on Earth is precluded by early arriving GCs.
The contour lines show the (expected) number of GCs at 20 Gy. For more than 1000 GCs at 20 Gy (but not so many to preclude human-like civilizations) there are relatively high likelihood ratios.
For example, (h=1000 Gy,n=7,Lmax=1000 Gy) has a likelihood ratio 1e-24. In this case there are around 5200 GCs existing by 20 Gy, so the toy deadline effect is triggered. However, this likelihood ratio is much smaller than the case (h=1000 Gy,n=7,Lmax=10 Gy) [just shifting along to the left] which has ratio 1e-20. In this latter case, there are 0.4 expected GCs by 20 Gy, so the toy deadline effect is not triggered.
Toy model discussion
The toy deadline effect that can be induced by higher Lmax (holding other parameters constant) is not strong enough to compete with humanity becoming increasingly atypical as Lmax increases. Increasing Lmax makes Earth-like civilizations less typical through:
(1) the count effect, where there are more longer-lived planets than shorter-lived ones like Earth. In the toy model, this effect accounts for a T(Lmax)/T(5) ≈ √(Lmax/5) fold decrease in typicality. This effect is relatively weak.
(2) the power law effect, where the greater habitable duration allows for more attempts at completing the hard steps. When there is a deadline effect at 20 Gy (say) and the universe is habitable from 5 Gy, any planet that is habitable for at least 15 Gy has three times the duration for an IC or GC to appear. For sufficiently hard steps, this effect roughly decreases life on Earth’s typicality by (15/5)^n. This effect is weaker when the deadline is set earlier (when there are faster-moving GCs).
This second effect can be strong, depending on when the deadline occurs. At a minimum, if the universe has been habitable since 5 Gy after the Big Bang, the deadline effect could occur in the next 1 Gy and there would still be a (10/5)^n typicality decrease. The power law effect’s decrease on our typicality can be minimised if all planets only became habitable at the same time as Earth.[3] (A rough worked comparison of the two effects is below.)
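As a rough worked example of the relative sizes of the two effects (my own numbers, using the toy T(Lmax) above): for Lmax = 1000 Gy and n = 7,

$$\frac{T(L_{\max})}{T(5)} = \sqrt{\frac{1000}{5}} \approx 14 \qquad \text{vs.} \qquad \left(\frac{15}{5}\right)^{n} = 3^{7} = 2187,$$

so the power law effect dominates the count effect by roughly two orders of magnitude.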
This toy model also shows that the deadline effect only happens in a very small part of the sample space. This motivates (to me) why the posterior on Lmax is pushed so far away from high values.
Likelihoods from the full model
I now show the likelihood ratios for pairs (n,Lmax) having fixed the other parameters in my model. I set
the probability of passing through all try-once steps w=1
the parameter that controls the habitability of the early universe u = 10^-5
the delay steps to take 1 Gy.
The deadline effect is visible in all cases. For example, the first graph in 1), with h = 10^4 Gy, v = c, fGC = 1, has (n = 5, Lmax = 10^4 Gy) with likelihood ratio 1.5e-25. Although this is high relative to other likelihood ratios with Lmax = 10^4 Gy, it is small compared to moving to the left to (n = 5, Lmax = 5 Gy), which has likelihood ratio 3e-22.
1) v=c,fGC=1
The plots differ by value of h
2) v=c,fGC=0.1
3) v=0.1c,fGC=0.1
4) v=0.1c,fGC=0.1
For SSA-like updates, this does not in fact matter
Chosen somewhat arbitrarily. I don’t think the exact number matters, though it should be higher for slower-expanding GCs.
This also depends on our reference class. I consider the reference class of observers in all ICs, but if we restrict this to observers in ICs that do not observe GCs, then we also require high expansion speeds.