# Tristan Cook

Karma: 476

I am a research analyst at the Center on Long-Term Risk.

I previously studied maths at the University of Cambridge and University of Warwick.

Send me anonymous feedback or messages here: https://www.admonymous.co/tristancook

• Thanks!

And thanks for the suggestion, I’ve created a version of the model using a Monte Carlo simulation here :-)

• This is a short follow-up to my post on the optimal timing of spending on AGI safety work, which, given exact values for the future real interest rate, diminishing returns, and other factors, calculated the optimal spending schedule for AI risk interventions.

This has also been added to the post’s appendix and assumes some familiarity with the post.

Here I consider the most robust spending policies, supposing uncertainty over nearly all parameters in the model[1], rather than finding the optimal solutions based on point estimates, and again find that the community’s current spending rate on AI risk interventions is too low.

My distributions over the model parameters imply that

• Of all fixed spending schedules (i.e. to spend X% of your capital per year[2]), the best strategy is to spend 4-6% per year.

• Of all simple spending schedules with two regimes (now until 2030, and 2030 onwards), the best strategy is to spend ~8% per year until 2030 and ~6% per year afterwards.
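The fixed-schedule comparison can be sketched with a crude Monte Carlo. The log utility, Gaussian real return, and uniform yearly AGI hazard below are placeholder assumptions of mine, not the post's actual functional forms or the notebook's inputs:

```python
import math
import random

def simulate_utility(spend_rate, n_samples=500, horizon=80, seed=0):
    """Crude Monte Carlo sketch of a fixed spending schedule.

    Placeholder assumptions: log utility of yearly spending (diminishing
    returns), a Gaussian real return, and a uniform-random yearly AGI
    hazard. None of these are the post's actual functional forms.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        capital = 1.0
        survive = 1.0  # probability AGI has not yet arrived
        utility = 0.0
        hazard = rng.uniform(0.01, 0.05)  # assumed yearly P(AGI)
        for _ in range(horizon):
            r = rng.gauss(0.03, 0.05)  # assumed yearly real return
            spend = spend_rate * capital
            utility += survive * math.log1p(spend)
            capital = (capital - spend) * (1 + r)
            survive *= 1 - hazard
        total += utility
    return total / n_samples

# Compare a few fixed rates under the same random draws (same seed):
rates = [0.01, 0.02, 0.04, 0.06, 0.10]
best = max(rates, key=simulate_utility)
```

Even this toy version reproduces the qualitative trade-off: spending too slowly risks AGI arriving with capital unspent, spending too quickly exhausts capital that could have compounded.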

I recommend entering your own distributions for the parameters in the Python notebook here[3]. Further, these preliminary results use few samples: more reliable results would be obtained with more samples (and more computing time).

I allow for post-fire-alarm spending (i.e., we are certain AGI is soon and so can spend some fraction of our capital). Without this feature, the optimal schedules would likely recommend a greater spending rate.

Caption: Fixed spending rate. See here for the distributions of utility for each spending rate.

Caption: Simple (two-regime) spending rate

Caption: The results from a simple optimiser[4], when allowing for four spending regimes: 2022-2027, 2027-2032, 2032-2037 and 2037 onwards. This result should not be taken too seriously: more samples should be used, the optimiser should run for more steps, and more intervals should be used. As with other results, this is contingent on the distributions of parameters.

### Some notes

• The system of equations—describing how a funder’s spending on AI risk interventions changes the probability of AGI going well—is unchanged from the main model in the post.

• This version of the model randomly generates the real interest rate, based on user inputs. So, for example, one’s capital can go down.

Caption: An example real interest rate function, cherry-picked to show how our capital can go down significantly. See here for 100 unbiased samples of the function.
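As a minimal sketch of how a randomly generated real rate plus ongoing spending can shrink capital (the drift and volatility below are illustrative assumptions, not the notebook's inputs):

```python
import random

def capital_path(spend_rate=0.05, years=30, mu=0.03, sigma=0.08, seed=1):
    """Capital under a randomly generated yearly real rate.

    mu and sigma are illustrative assumptions, not the notebook's inputs.
    A bad run of negative real rates can pull capital below its start.
    """
    rng = random.Random(seed)
    capital = 1.0
    path = [capital]
    for _ in range(years):
        r = rng.gauss(mu, sigma)  # yearly real rate; can be negative
        capital *= (1 - spend_rate) * (1 + r)
        path.append(capital)
    return path

path = capital_path()
```

Plotting several such paths gives pictures like the caption above: most grow, but some drawdowns are large.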

Caption: Example probability-of-success functions. The filled circle indicates the current preparedness and probability of success.

Caption: Example competition functions. They all pass through (2022, 1) since the competition function is the relative cost of one unit of influence compared to the current cost.

This short extension started due to a conversation with David Field and a comment from Vasco Grilo; I’m grateful to both for the suggestion.

1. ^

Inputs that are not considered include: historic spending on research and influence; the rate at which the real interest rate changes; and the post-fire-alarm returns, which are taken to be the same as the pre-fire-alarm returns.

2. ^

And supposing a 50:50 split between spending on research and influence

3. ^

This notebook is less user-friendly than the notebook used in the main optimal spending result (though not unusable); let me know if improvements to the notebook would be useful for you.

4. ^

The intermediate steps of the optimiser are here.

• 25 Nov 2022 16:19 UTC
4 points
0 ∶ 0
in reply to: Jason’s comment

Previously the benefactor has been Carl Shulman (and I’d guess he is again, but this is pure speculation). From the 2019-2020 donor lottery page:

Carl Shulman will provide backstop funding for the lotteries from his discretionary funds held at the Centre for Effective Altruism.

The funds mentioned are likely the $5m from March 2018: The Open Philanthropy Project awarded a grant of $5 million to the Centre for Effective Altruism USA (CEA) to create and seed a new discretionary fund that will be administered by Carl Shulman.

• This is great to hear! I’m personally more excited by quality-of-life improvement interventions rather than saving lives so really grateful for this work.

Echoing kokotajlod’s question for GiveWell’s recommendations, do you have a sense of whether your recommendations change with a very high discount rate (e.g. 10%)? Looking at the graph of GiveDirectly vs StrongMinds, it looks like the vast majority of benefits are in the first ~4 years.

Minor note: the link at the top of the page is broken (I think the 1123 in the URL needs to be changed to 1124)

• When LessWrong posts are crossposted to the EA Forum, there is a link in EA Forum comments section:

This link just goes to the top of the LessWrong version of the post and not to the comments. I think either the text should be changed or the link go to the comments section.

• (minor point that might help other confused people)

I had to google CMO (which I found to mean Chief Marketing Officer) and also thought that BOAS might be an acronym—but found on your website:

BOAS means good in Portuguese, clearly explaining what we do in only four letters!

• Increasing/decreasing one’s AGI timelines increases/decreases the importance[1] of non-AGI existential risks, because there is more/less time for them to occur[2].

Further, as time passes and we get closer to AGI, the importance of non-AI x-risk decreases relative to AI x-risk. This is a particular case of the above claim.

1. ^

but not necessarily tractability & neglectedness

2. ^

If we think that nuclear/bio/climate/other work becomes irrelevant post-AGI, which seems very plausible to me.
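One way to make the timelines claim quantitative, under an assumed constant-hazard model (my simplification, not something from the original comment): the probability a non-AGI catastrophe occurs before AGI arrives at time T is 1 − exp(−rT) for yearly hazard r, which is increasing in T.

```python
import math

def p_risk_before_agi(yearly_hazard, years_to_agi):
    """P(a non-AGI catastrophe strikes before AGI arrives), assuming a
    constant yearly hazard (an illustrative exponential model)."""
    return 1 - math.exp(-yearly_hazard * years_to_agi)

# Longer AGI timelines leave more time for the non-AGI risk to occur:
short_tl = p_risk_before_agi(0.001, 15)
long_tl = p_risk_before_agi(0.001, 80)
```

Under this model, shifting one's timelines from 80 to 15 years cuts the chance a given non-AGI risk matters (pre-AGI) by roughly the ratio of the timelines, which is the sense in which shorter timelines deflate non-AGI x-risk importance.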

• These seem neat! I’d recommend posting them to the EA Forum—maybe just as a shortform—as well as on your website so people can discuss the thoughts you’ve added (or maybe even posting the thoughts on your shortform with a link to your summary).

For a while I ran a podcast discussion meeting at my local group and I think summaries like this would have been super useful to send to people who didn’t want to / have time to listen. A bonus—though maybe too much effort—would be generating discussion prompts based on the episode.

# The optimal timing of spending on AGI safety work; why we should probably be spending more now

24 Oct 2022 17:42 UTC
78 points
9 comments · 36 min read · EA link
• This looks exciting!

The application form link doesn’t currently work.

• I highly recommend Nick Bostrom’s working paper Base Camp for Mt. Ethics.

Some excerpts on the idea of the cosmic host that I liked most:

34. At the highest level might be some normative structure established by what we may term the cosmic host. This refers to the entity or set of entities whose preferences and concordats dominate at the largest scale, i.e. that of the cosmos (by which I mean to include the multiverse and whatever else is contained in the totality of existence). It might conceivably consist of, for example, galactic civilizations, simulators, superintelligences, or a divine being or beings.

39. One might think that we could have no clue as to what the cosmic norms are, but in fact we can make at least some guesses:

a. We should refrain from harming or disrespecting local instances of things that the cosmic host is likely to care about.

b. We should facilitate positive-sum cooperation, and do our bit to uphold the cosmic normative order and nudge it in positive directions.

c. We should contribute public goods to the cosmic resource pool, by securing resources and (later) placing them under the control of cosmic norms. Prevent xrisk and build AI?

d. We should be modest, willing to listen and learn. We should not too headstrongly insist on having too much our way. Instead, we should be compliant, peace-loving, industrious, and humble vis-a-vis the cosmic host.

41. Maybe this could itself be part of an alignment goal: to build our AI such that it wants to be a good cosmic citizen and comply with celestial morality.

a. We may also want it to cherish its parents and look after us in our old age. But a little might go a long way in that regard.

• I’ve been building a model to calculate the optimal spending schedule on AGI safety and am looking for volunteers to run user experience testing.

Let me know via DM on the forum or email if you’re interested :-)

The only requirements are (1) to be happy to call & share your screen for ~20 to ~60 minutes while you use the model (a Colab notebook which runs in your browser) and (2) some interest in AI safety strategy (but certainly no expertise necessary)

• Thanks for writing this! I think you’re right that if you buy the Doomsday argument (or assumptions that lead to it) then we should update against worlds with 10^50 future humans and towards worlds with Doom-soon.

However, you write

My take is that the Doomsday Argument is … but it follows from the assumptions outlined

which I don’t think is true. For example, your assumptions seem equally compatible with the self-indication assumption (SIA) that doesn’t predict Doom-soon.[1]

I think a lot of confusions in anthropics go away when we convert probability questions to decision problem questions. This is what Armstrong’s Anthropic Decision Theory does.

Interestingly, something like the Doomsday argument applies for average utilitarians: they bet on Doom-soon, since in the case where they win the bet the utility is spread over many fewer people.

1. ^

Katja Grace has written about SIA Doomsday but this is (in my view) contingent on beliefs about aliens & simulations whereas SSA Doomsday is not.

• Thanks for your response Robin.

I stand by the claim that both updating on the time remaining and considering our typicality among all civilizations are errors in anthropic reasoning, but agree there are non-time-remaining reasons (e.g. looking at steps on the evolution to intelligent life and reasoning about their difficulties). I think my ignorance-based prior was naive for not considering this.

I will address the issue of the compatibility of the two high values by looking at the likelihood ratios of pairs of values.

I first show a toy model to demonstrate that the deadline effect is weak (but present) and then reproduce the likelihood ratios from my Bayesian model.

My toy model (code here)

I make the following simplifications in this toy likelihood ratio calculation

• There are two types of habitable planets: those that are habitable for and those that are habitable for

• Given a maximum habitable planet duration , I suppose there are approximately planets in the observable universe of this habitability.

• The universe first becomes habitable at 5 Gy after the Big Bang.

• All ICs become GCs.

• There are no delay steps.

• There are no try-once steps.[1]

And the most important assumptions:

• If there are at least 5000 GCs[2] after the Big Bang I suppose a toy ‘deadline effect’ is triggered and no ICs or GCs appear after

• If there are at least 1000 GCs after the Big Bang I suppose Earth-like life is precluded (these cases have zero likelihood ratio).

Using the PDF & CDF of the Gamma distribution with parameters n and h:

• If then there are too many GCs and humanity is precluded.

• The likelihood ratio is 0.

• If the first case is false and the toy deadline effect is triggered, I suppose no more GCs arrive.

• The likelihood ratio is directly proportional to

• Otherwise, there is no deadline effect and life is able to appear late into the universe.

• The likelihood ratio is directly proportional, with the same constant of proportionality as above, to

Toy model results

Above: plots of the likelihood ratios of pairs. The three plots vary by the value of the geometric mean of the hardness of the steps.

The white areas show where life on Earth is precluded by early arriving GCs.

The contour lines show the (expected) number of GCs at 20 Gy. For more than 1000 GCs at 20 Gy (but not so many to preclude human-like civilizations) there are relatively high likelihood ratios.

For example, has a likelihood ratio 1e-24. In this case there are around 5200 GCs existing by 20 Gy, so the toy deadline effect is triggered. However, this likelihood ratio is much smaller than the case [just shifting along to the left] which has ratio 1e-20. In this latter case, there are 0.4 expected GCs by 20 Gy, so the toy deadline effect is not triggered.

Toy model discussion

This toy deadline effect, which can be induced by a higher maximum habitable duration (holding other parameters constant), is not strong enough to compete with humanity becoming increasingly atypical as that duration grows. Increasing the maximum habitable duration makes Earthlike civilizations less typical through:

(1) the count effect, where there are more longer-lived planets than shorter-lived ones like Earth. In the toy model, this effect accounts for a decrease in typicality. This effect is relatively weak.

(2) the power law effect, where the greater habitable duration allows for more attempts at completing the hard steps. When there is a deadline effect at 20 Gy (say) and the universe is habitable from 5 Gy, any planet that is habitable for at least 15 Gy has three times Earth’s duration for an IC or GC to appear. For sufficiently hard steps, this effect roughly decreases life on Earth’s typicality by the corresponding power of three. This effect is weaker when the deadline is set earlier (when there are faster moving GCs).

This second effect can be strong, dependent on when the deadline occurs. At a minimum, if the universe has been habitable since 5 Gy after the Big Bang the deadline effect could occur in the next 1 Gy and there would still be a typicality decrease. The power law effect’s decrease on our typicality can be minimised if all planets only became habitable at the same time as Earth.[3]
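The power-law effect above can be illustrated with an Erlang CDF (a Gamma distribution with integer shape), which gives the chance that n sequential exponential "hard steps" all complete within a habitable window. The step count and hardness below are made-up illustrative numbers, not the post's parameters:

```python
import math

def erlang_cdf(t, n, rate):
    """P(n sequential exponential(rate) hard steps all complete by time t)."""
    x = rate * t
    return 1 - sum(math.exp(-x) * x**k / math.factorial(k) for k in range(n))

# Made-up illustrative numbers (not the post's parameters):
n_steps = 5        # number of hard steps
hardness = 1e-3    # per-Gy completion rate of each step
p_short = erlang_cdf(10.0, n_steps, hardness)  # Earth-like habitable window
p_long = erlang_cdf(30.0, n_steps, hardness)   # three times the window

# For sufficiently hard steps, tripling the window multiplies the chance of
# completing all n steps by roughly 3**n_steps:
ratio = p_long / p_short
```

With these numbers the ratio is close to 3^5 = 243, showing how quickly longer-lived planets dominate the reference class when steps are hard.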

This toy model also shows that the deadline effect occurs only in a very small part of the sample space. This motivates (to me) why the posterior is pushed so far away from high values.

Likelihoods from the full model

I now show the likelihood ratios for pairs, having fixed the other parameters in my model. I set

• the probability of passing through all try-once steps

• the parameter that controls the habitability of the early universe

• the delay steps to take .

The deadline effect is visible in all cases. For example, the first graph in 1) has a point with likelihood ratio 1.5e-25. Although this is high relative to other nearby likelihood ratios, it is small compared to moving to the left, which has likelihood ratio 3e-22.

1)

The plots differ by value of

2)

3)

4)

1. ^

For SSA-like updates, this does not in fact matter

2. ^

Chosen somewhat arbitrarily. I don’t think the exact number matters, though it should be higher for slower-expanding GCs.

3. ^

This also depends on our reference class. I consider the reference class of observers in all ICs, but if we restrict this to observers in ICs that do not observe GCs then we also require high expansion speeds.

• I recently wrote about how AGI timelines change the relative value of ‘slow’ acting neartermist interventions relative to ‘fast’ acting neartermist interventions.

It seems to me that EAs in other cause areas mostly ignore this, though I haven’t looked into this too hard.

My (very rough) understanding of Open Philanthropy’s worldview diversification approach is that the Global Health and Wellbeing focus area team operates with (potentially) different values and epistemic approaches from the Longtermism focus area team. The epistemic approach of the former seems more reliant on more “common sense” ways to do good.

• Thanks for running the survey, I’m looking forward to seeing results!

I’ve filled out the form but find some of the potential arguments problematic. It could be worth seeing how persuasive others find these arguments, but I would be hesitant to promote arguments that don’t seem robust. In general, I think more disjunctive arguments work well.

For example, (being somewhat nitpicky):

Everyone you know and love would suffer and die tragically.

Some existential catastrophes could happen painlessly and quickly.

We would destroy the universe’s only chance at knowing itself...

Aliens (maybe!) or (much less likely imo) another intelligent species evolving on Earth

There are co-benefits to existential risk mitigation: prioritizing these risks means building better healthcare infrastructure, better defense against climate change, etc.

It seems that work on biorisk prevention does involve “building better healthcare infrastructure”, but it is maybe misleading to characterise it this way, since I imagine people think of something different when they hear the term. There are also drawbacks to some (proposed) existential risk mitigation interventions.