I was not aware of this effort and I appreciate the information about Kindertransport. However, using it as a justification for open borders—a radical idea that seems like it could lead to disaster if it were actually enacted and maintained—seems off-base. Kindertransport is a point in favor of permissive refugee policy, not unlimited economic migration. That part felt a bit like a bait and switch, which I didn’t appreciate.
I think any estimate would have a confidence interval so wide that it would be useless. (I said “variance” before; maybe that’s a less well-known term)
I am aware of what you mean by variance, but I don’t think this challenges my point: I dispute the idea that you can both say “we can’t make any useful estimate on the likelihood of success” and still claim “it’s worth funding (despite any opportunity costs and other potential drawbacks).”
As the rest of this comment gets into, even a really wide (initial/early-stage) confidence interval can be useful as long as the other variables involved are sufficiently large that you can credibly say “it seems very likely that the probability is at least X%, which is enough to make this very cost effective in expectation.”
(This line of reasoning is very pronounced in longtermism)
Curious where the crux of our disagreement is: Would you agree that some things that can’t be measured are still worth doing? And is your belief also that pushing the abundance agenda can’t possibly be more cost-effective than donations to AMF?
I think one crux/sticking point for me is: I believe that you could make a highly simplistic but illustrative 3-variable plausibility model involving the following questions:
How much funding/resources should be devoted?
What is the probability of achieving X outcome if we devote the above-given amount of resources?
How valuable is X outcome (e.g., in terms of QALYs)?
This is obviously oversimplified (the actual claims would be distributions rather than point estimates), but it requires you to explicate/stake claims like “even under conservative assumptions X, Y, and Z, the expected value of this intervention is still really large.” Relatedly, it allows you to establish breakeven points. Consider the following example (a minimal numeric sketch of it follows below):
Let’s suppose you claim achieving some policy agenda outcome would produce somewhere between $1T and $10T of value.
Suppose you argue that spending $100M on some kind of movement/systemic change campaign would increase the likelihood of achieving that outcome by somewhere between 0.1% and 10%.
Those confidence intervals are rather large (the probability estimate spans two orders of magnitude), but even with such wide confidence intervals you can claim that a conservative estimate of the expected value is “at least $1B,” which is “at least a 10x return on investment.” And that’s a claim that I and others can at least dissect.
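To make the arithmetic of that example fully explicit, here is a minimal back-of-the-envelope sketch in Python; the numbers are just the illustrative figures above, not estimates of any real intervention:

```python
# Toy BOTEC for the 3-variable model above. All numbers are the illustrative
# figures from the example, not real estimates of anything.

cost = 100e6                          # $100M spent on the campaign
value_low, value_high = 1e12, 10e12   # $1T to $10T value of achieving the outcome
p_low, p_high = 0.001, 0.10           # 0.1% to 10% increase in likelihood of success

# Conservative expected value: low end of both intervals
ev_conservative = p_low * value_low    # = $1B

# Optimistic expected value: high end of both intervals
ev_optimistic = p_high * value_high    # = $1T

print(f"Conservative EV: ${ev_conservative / 1e9:.0f}B "
      f"(~{ev_conservative / cost:.0f}x return on ${cost / 1e6:.0f}M)")
print(f"Optimistic EV: ${ev_optimistic / 1e12:.0f}T")

# Breakeven: the probability increase at which expected value just covers the
# cost, using the conservative value estimate (this connects to the
# breakeven-point questions further down).
p_breakeven = cost / value_low         # = 0.01%
print(f"Breakeven probability increase: {p_breakeven:.2%}")
```

Even the conservative corner of those wide intervals supports the “at least 10x” claim, which is exactly the kind of statement others can then dissect.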
However, my concern/suspicion is that upon explicating these estimates, the “conservative estimate” of expected value will actually not look very large—and in fact I suspect that even my median estimate will probably be lower than that of global health and development charities.
Would you agree that some things that can’t be measured are still worth doing?
I would push back against the focus on the word “measured” here: “measured” is typically used to refer to estimates that are so objective, verifiable, and/or otherwise defensible that they get treated as a special category of knowledge, like “we’ve empirically measured and verified that the average return on investment is X.”
I wholly agree that some things which can’t be “measured” are still worth doing, and measurements are not infallible anyway. It’s not about measurements, it’s about estimates. Going back to the point I made at the beginning, the problem I see with your stance is that (based on my limited interaction here) you seem to be asserting both that no reliable estimates can be made and that your estimate nonetheless finds this worthwhile. But I’m unclear on what your estimate is, and thus I can’t evaluate it.
Regarding “luck,” I will just redirect back to my claim about breakeven points and reference class estimations: does the reliance on “luck” (fortunate circumstances) set the overall likelihood of success at something like 1%? 0.1%?
What is the breakeven point? And does a quick review of the historical frequency of such “luck” produce an estimate which exceeds that conservative breakeven point?
I’m a bit confused on where you stand on this: on the one hand, you seem to be suggesting that it’s not possible to derive a decent estimate on the likelihood of success, but on the other hand you are still suggesting that you think it is worth funding.
I don’t dispute that it can be hard to do “accurate” analysis—e.g., to even be within an order of magnitude of accuracy on certain probability or effect-size estimates—but the key behind various back-of-the-envelope calculations (BOTECs) is getting a rough sense of “does this seem to be at least 10:1 expected return on investment given XYZ explicit, dissectible assumptions?”
If the answer is yes, then that’s an important signal saying “this is worth deeper-than-BOTEC-level analysis.” Certain cause areas like AI safety/governance, biosecurity, and a few others have passed this bar by wide margins even when evidence/arguments were relatively scarce and it was (and still is!) really hard to come up with reliable specific estimates.
Explicating your reasoning in such ways is really important for making your analysis more legible/dissectible for others and also for yourself: it is quite easy to think “X is going to happen/work” before laying out the key steps/arguments, but sometimes by explicating your reasoning you (or others) can identify flawed assumptions, outright contradictions, or at least key hinge points in models.
Systemic change can be hard to predict, but (I suspect that) arguably everything can be given some kind of probability estimate in theory, even if it’s a situation of pure uncertainty that leads to an estimate of 1/n for n possible outcomes (such as ~50% for coin flips). These don’t have to be good estimates, but they need to be explicit so that 1) other people can evaluate them, and 2) you can use such calculations in your model (since you can’t multiply a number by ”?%”).
One thing that might be useful in this situation is to establish some kind of “outside view” or reference class: how often do these kinds of social movements/reforms work, and how beneficial do they tend to be? Once you have a generic reference class, you can add in arguments to refine the generic class to better fit this specific case (i.e., a systemic change being pushed by people in EA).
On a separate, more specific, but less important note: I especially take issue with the idea of “luck” being factored into the model. I suspect you don’t actually mean “luck” in the more superstitious sense (i.e., a cosmic quality that someone has or doesn’t have), but it’s exactly this kind of question/uncertainty (e.g., the likelihood that the environment will be favorable or that people will be in the right place at the right time) that needs to be made more explicit.
I think it’s fair to contend that EA is “biased” towards high-legibility, quantitative outcomes, but I don’t think this is very harmful on balance. Importantly, EA is fairly open to attempted quantification of normally-hard-to-quantify concepts, but it requires putting in some mental legwork and making plausible quantifications/models (even loose ones) of how valuable the idea is.
If you could describe in a step-by-step (e.g., probability X impact) manner a variety of plausible pathways or arguments by which this approach could have really high expected value, I would be interested to see such a model. For example, if you can say “I believe there is a P% chance that spending $R would lead to X outcome(s), which has an F% chance of producing U QALYs/[or other metric] relative to the counterfactual, creating an expected value of Z QALYs/benefits per dollar spent,” where Z seems like a plausible number (based on the plausibility of the rest of the model), then it probably isn’t something that EA will just dismiss (a toy version of that claim format is sketched below). This is heavily related to Open Philanthropy’s post about reasoning transparency, which (among other benefits) makes it easier for someone else to dissect another person’s claims.
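For concreteness, here is a toy sketch of that claim format in Python. Every variable name and number below is a hypothetical placeholder of my own, not anyone’s actual estimate:

```python
# Toy reasoning-transparency template for a claim of the form "P% chance that
# spending $R leads to outcome X, which has an F% chance of producing U QALYs."
# All numbers are hypothetical placeholders, purely to show the structure.

R = 1_000_000     # dollars spent
P = 0.05          # chance the spending leads to outcome X
F = 0.20          # chance outcome X actually produces the benefit
U = 1_000_000     # QALYs (or another metric) gained relative to the counterfactual

expected_qalys = P * F * U    # expected benefit of the spending
Z = expected_qalys / R        # expected QALYs per dollar

print(f"Expected benefit: {expected_qalys:,.0f} QALYs")
print(f"Z = {Z:.3f} QALYs per dollar spent")
```

Even a model this crude forces the key assumptions (P, F, U) into the open where they can be disputed, which is the reasoning-transparency point.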
I do think it may be difficult to show really high yet plausible expected value for this intervention, but I’d still be open to seeing such an analysis, and personally I think that should probably be an initial, unrequested step before complaining about EA not being willing to consider hard-to-quantify ideas.
One quick note about that new introduction article:
The article says “From 2011-2022, about 260,000 people were killed by terrorism globally, compared to over 20 million killed by COVID-19.”
However, the footnote provided seemingly cuts off terrorism deaths data at 2020: “258,350 people were killed by terrorism globally between 2011 and 2020 inclusive.”
In my view, this isn’t substantive, but it seems worth trying to fix if this is “the” introduction article.
I felt uncomfortable about just taking 30 seconds to choose Will, and did end up spending more time reading some of the other bios, but in the end this seems like a rather silly popularity contest, especially given that it basically doesn’t even describe how you should vote (e.g., “who do you think is the most beneficial ‘thinker’?”). Of course, the fact that it is silly and ought not to be given attention by the public does not mean the public will ignore it. So yeah, I think Will is probably a good choice, and I don’t really have any substantive qualms with voting for him without having read many other profiles.
Once again, I’ll say that a study which analyzed the persuasion psychology/sociology of “x-risk from AI” (e.g., what lines of argument are most persuasive to what audiences, what’s the “minimal distance / max speed” people are willing to go from “what is AI risk” to “AI risk is persuasive,” how important expert statements are vs. theoretical arguments, what role fiction plays in magnifying or undermining AI x-risk fears) seems like it would be quite valuable.
Although I’ve never held important roles or tried to persuade important people, in my conversations with peers I have found it difficult to walk the line between “sounding obsessed with AI x-risk” and “underemphasizing the risk,” because I just don’t have a good sense of how fast I can go from someone being unsure of whether AGI/superintelligence is even possible to “AI x-risk is >10% this century.”
I have been suggesting this (and other uses of Kialo) for a while, although perhaps not as frequently or forcefully as I ought to… (I would recommend linking to the site, btw)
Could you describe how familiar you are with AI/ML in general?
However, supposing the answer is “very little,” then the first simplified point I’ll highlight is that ML language models already seek goals, at least during training: the neural networks adjust to perform better at the language task they’ve been given (to put it simplistically).
If your question is “how do they start to take actions aside from ‘output optimal text prediction’”, then the answer is more complicated.
As a starting point for further research, have you watched Rob Miles’s videos about AI and/or read Superintelligence, Human Compatible, or The Alignment Problem?
Looking back at my comment, I probably came off much more pessimistic/critical than I intended to, especially since I wasn’t trying to evaluate both the positives and negatives of your post (and since it was your first post); I simply wanted to inject a few lines of thought that most shifted my thinking on this topic a while ago.
Moving forward, I wouldn’t want you to be overly cautious/slow in writing on the forum; definitely don’t take my thoughts as condemnation of your writing!
Because if a given person didn’t solve the problem in year X, someone else may have solved the problem just a few years (maybe decades?) later.
Many people have already made comments, but I’ll throw in my 2 cents:
I don’t come from an Ivy League school or related background, nor do I have a fancy STEM degree, but I feel decently at home in the EA community.
I often thought that my value was determined by how “well” I could do my job in my intended field (policy research) vs. whomever I’m replacing. However, one of the great insights of EA is that you don’t have to be the 99th percentile in your field in order to be impactful: in some fields (e.g., policy research arguably) the most important consideration is “how much more do you focus on important topics—especially x-risk reduction—vs. whomever you’re replacing,” given how little incentive/emphasis our society places on certain moral patients/outcomes (e.g., poor people in a foreign country, animals, future sentient beings).
There very likely is something at least moderately high-impact that one can contribute to even if they are not traditionally very “smart,” especially now in the longtermist/x-risk reduction (and animal welfare) space. (I personally don’t know as much about other fields, but suspect that similar points apply.)
This is a bit of an amorphous question with tons of possible answers in some sense—e.g., the development of the scientific method—but actually identifying counterfactual benefits is tricky. The examples you listed probably don’t have major counterfactual benefits long into the future.
A semi-quick chain of thoughts (caveats/nuance dial set to low):
1. I’ve frequently thought that the variable you mention (e.g., “we have a *funding* overhang” vs. “*talent* underhang”) should be whatever you want the emphasis to be on in the immediate conversation: if your point is “we need more talent/ideas,” then make that the focus, and then, if it’s relevant, you can mention “the limiting factor here is not money.”
2. However, sometimes your point may be “we are short on lots of different things relative to money (e.g., talent, information, ideas, connections),” in which case it’s seemed to me like it may be simpler to just say “funding overhang”—especially if the point/question is “how can we use funds to reduce shortfalls in other enablers?”
3. Still, I’ll grant that “overhang” might be misleading or have negative connotations, making it worthwhile to use different language even in situations like (2). But I don’t know! Personally, I never thought the term was that big of a deal.
(Apologies if I’m just missing the point somewhere; I didn’t spend much time worrying about this possibility)
Could you provide a tl;dr section or other form of summary up front?
I’m definitely not the best person to give feedback on this, but I’ll just briefly share a few thoughts:
I’ve heard that EA grant makers often have relatively little time to review grant applications. This may or may not be true for EAIF, but supposing it is, even yellow flags like that offset article might cause a reviewer to quickly become pessimistic about providing grants (for some of the following reasons).
I would have recommended not using the example of rape; murder offsets probably would have been a better alternative. I only skimmed the post, but it really didn’t help that towards the beginning you make the seemingly intentionally controversial point that “[Sometimes rape is permissible… you probably agree deep down. That is, if it is to prevent more rape.]” This ordering (saying “you probably agree” before clarifying “if it were in some twisted trolley problem scenario”) and phrasing (e.g., “deep down…”) are needlessly controversy-inviting. To be honest, to me these kinds of details genuinely do reflect some lack of perspective/room-reading or rhetorical finesse, regardless of whether you ultimately oppose the idea of rape offsets. (It also very much gives me flashbacks to the infamous Robin Hanson post, which really hurt his reputation and reflected a similar lack of perspective…) This may not be such a problem if I am personally evaluating your character, but:
Grantmakers are probably justified in being cautious about downside risks, including when it comes to optics risks. “EA grant makers fund writer of blog that callously discusses ‘rape offsets’” might be a very unfair social media characterization, but fairness doesn’t really matter here, and I can’t be confident it won’t get pulled into some broader narrative attacking—however fairly or unfairly—EA overall. (Speaking as someone who’s never analyzed grant applications) I suspect you would have to have a really good case for potential upside to make it worth spending a few extra hours analyzing those optics risks, and in the end there may (or may not?) be plenty of other people to fund instead.
As for your overall blog, I haven’t read it, but I wouldn’t be surprised if it is otherwise good, and I’m glad to see a blog discussing moral issues. But rape is a topic that needs to be treated with a lot of care and caution, and probably should be avoided when it is just being used to make a point separate from rape.
Re your point 4-1: I wrote a relevant post some number of months ago and never really got a great answer: https://forum.effectivealtruism.org/posts/HZacQkvLLeLKT3a6j/how-might-a-herd-of-interns-help-with-ai-or-biosecurity
And now, here I am going into what may be my ~6th trimester of “not having an existential risk reduction (or relevant) job or internship despite wanting to get one”… 🙃
I’m a bit unsure of how/where to engage with the overall framework as a “framework,” since I’m unclear on its falsifiable claims about the world or disputable recommendations, but I think I would push back against the idea of trying to simply sponge up knowledge in most contexts: I think that is a very science-y mindset, which certainly isn’t bad, but the vast majority of knowledge is only extrinsically valuable. Having a mindset which treats much knowledge as intrinsically valuable might be helpful for motivational and other purposes, but it comes with drawbacks: you have to prioritize the knowledge you focus on, and doing that probably requires integrating your framework here into a broader, non-linear cycle of “collect information, identify options, weigh tradeoffs/goals, take action, collect information, …”
So, perhaps I don’t necessarily disagree with the framework—and I recognize that you might already understand what I’m about to say—but I think that trying to use/evaluate this framework outside of a broader decision-making/goal-pursuing process might not be highly effective. Still, as my own posts have suggested, I definitely don’t oppose collecting, organizing, and sharing knowledge.
Glad to see someone already wrote out some of my thoughts. To just tag on, some of my key bullet points for understanding Pascalian wager problems are:
• You can have offsetting uncertainties and consequences (as you mention), and thus you should fight expected value fire with EV fire.
• Anti-Pascalian heuristics are not meant to directly maximize the accuracy of your beliefs, but rather to improve the effectiveness of your overall decision-making in light of constraints on your time/cognitive resources. If we had infinite time to evaluate everything—even possibilities that seem like red herrings—it would probably usually be optimal to do so, but we don’t have infinite time so we have to make decisions as to what to spend our time analyzing and what to accept as “best-guesstimates” for particularly fuzzy questions. Thus, you can “fight EV fire with EV fire” at the level of “should I even continue entertaining this idea?”
• Very low probabilities (risk estimates) tend to be associated with greater uncertainty, especially when the estimates aren’t based on clear empirical data. As a result, really low probability estimates like “1/100,000,000” tend to be more fragile to further analysis, which crucially plays into the next bullet point.
• Sometimes the problem with Pascalian situations (especially in some high school policy debate rounds I’ve seen) is that someone fails to update based on the velocity/acceleration of their past updates: suppose one person presents an argument saying “this very high impact outcome is 1% likely.” The other person spends a minute arguing that it’s not 1% likely, and it actually only seems to be 0.1% likely. They spend another minute disputing it and it then seems to be only 0.01% likely. They then say “I have 5 other similar-quality arguments I could give, but I don’t have time.” The person that originally presented the argument could then say “Ha! I can’t dispute their arguments, but even if it’s 0.01% likely, the expected value of this outcome still is large” … the other person gives a random one of their 5 arguments and drops the likelihood by another order of magnitude, etc. The point being, given the constraints on information flow/processing speed and available time in discourse, one should occasionally take into account how fast they are updating and infer the “actual probability estimate I would probably settle on if we had a substantially greater amount of time to explore this.” (Then fight EV fire with EV fire)