FWIW, while I didn’t downvote the comment, I can see how folks would consider “Why stop at X?” a lazy “gotcha” argument or appeal to absurdity heuristic, which seems worth discouraging.
I wrote an article recently about insects.
FYI, I assume the link here doesn’t go to the post you intended.
I partly had in mind personal communications, but some public examples (and very brief summaries of my reactions, not fleshed out counterarguments):
In “Sequence thinking vs. cluster thinking”, Holden says, “For example, obeying common-sense morality (“ends don’t justify the means”) heuristics seems often to lead to unexpected good outcomes, and contradicting such morality seems often to lead to unexpected bad outcomes.”
I guess the argument is supposed to be that we have empirical evidence of heuristics working well in this sense. But on its face, this just pushes the question back to why we should expect “how well a strategy works under unknown unknowns” to generalize so cleanly from local scales to longtermist scales. (Related discussion here.)
“Heuristics for clueless agents” claims that “heuristics produce effective decisions without demanding too much of ordinary decision-makers.”
Their arguments seem to be some combination of “in some decision situations, it’s pretheoretically clear which decision procedures are more or less ‘effective’” (Sec. 5) and “heuristics have theoretical justification based on the bias-variance tradeoff” (Sec. 7). But pretheoretic judgments about effectiveness from a longtermist perspective seem extremely unreliable, and appeals to bias-variance tradeoffs are irrelevant when the problem (under unknown unknowns) is model misspecification.
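To see why the bias-variance appeal doesn’t carry over, here’s a toy sketch (entirely my own illustration, not from the paper, with made-up numbers): a biased but low-variance heuristic beats a higher-variance estimator under the distribution we modeled, exactly as the bias-variance tradeoff promises, but the comparison reverses under a shift the model didn’t anticipate.

```python
import numpy as np

rng = np.random.default_rng(0)
modeled = rng.normal(loc=0.0, scale=1.0, size=(1000, 20))  # the environment we modeled
shifted = rng.normal(loc=3.0, scale=1.0, size=(1000, 20))  # unanticipated shift ("unknown unknowns")

heuristic = lambda xs: 0.1          # fixed guess: biased but zero-variance
sample_mean = lambda xs: xs.mean()  # unbiased but higher-variance

def mse(estimator, data, true_mean):
    """Mean squared error of an estimator of the mean, over many datasets."""
    return np.mean([(estimator(row) - true_mean) ** 2 for row in data])

# Under the modeled distribution, the tradeoff favors the heuristic...
print(mse(heuristic, modeled, 0.0), mse(sample_mean, modeled, 0.0))  # ~0.01 vs ~0.05
# ...but under misspecification, the heuristic's fixed bias dominates.
print(mse(heuristic, shifted, 3.0), mse(sample_mean, shifted, 3.0))  # ~8.4 vs ~0.05
```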
Why expect “heuristics” to be robust to unknown unknowns?
I often read/hear claims that, if we’re worried that our evaluations of interventions won’t hold up under unknown unknowns, we should follow (simple) heuristics. But what precisely is the argument for this? This isn’t a rhetorical question — I’m just noting my confusion and want to understand this view better.
Interested to hear more from those who endorse this view!
I worry we’re going to continue to talk past each other. So I don’t plan to engage further. But for other readers’ sake:
I definitely don’t treat broad imprecision as “a privileged default”. In the post I explain the motivation for having more or less severely imprecise credences in different hypotheses. The heart of it is that adding more precision, beyond what the evidence and plausible foundational principles merit, seems arbitrary. And you haven’t explained why your bottom-line intuition — about which decisions are good w.r.t. a moral standard as extremely far-reaching as impartial beneficence[1] — would constitute evidence or a plausible foundational principle. (To me this seems pretty clearly different from the kind of intuition that would justify rejecting radical skepticism.)
I don’t see how this engages with the arguments I cited, or the cited post more generally. Why do you think it’s plausible to form a (non-arbitrary) determinate judgment about these matters? Why think these determinate judgments are our “best” judgment, when we could instead have imprecise credences that don’t narrow things down beyond what we have reason to?
I don’t think this response engages with the argument that judgment calls about our impact on net welfare over the whole cosmos are extraordinary claims, so they should be held to a high epistemic standard. What do you think of my points on this here and in this thread?
I think this is the most honest answer, from an impartial altruistic perspective.
I’ve got a big moral circle (all sentient beings and their descendants), but it does not extend to aliens because of cluelessness.
...
I’m quite confident that if we’re thinking about the moral utility of spacefaring civilisation, we should at least limit our scope to our own civilisation
I agree that the particular guesses we make about aliens will be very speculative/arbitrary. But “we shouldn’t take the action recommended by our precise ‘best guess’ about XYZ” does not imply “we can set the expected contribution of XYZ to the value of our interventions to 0”. I think if you buy cluelessness — in particular, the indeterminate beliefs framing on cluelessness — the lesson you should take from Maxime’s post is that we simply aren’t justified in saying any intervention with effects on x-risk is net-positive or net-negative (w.r.t. total welfare of sentient beings).
This is linked to my discussion with Jim about determinate credences (since I didn’t initially understand this concept well, ChatGPT gave me a useful explanation).
FYI, I don’t think ChatGPT’s answer here is accurate. I’d recommend this post if you’re interested in (in)determinate credences.
To be clear, “preferential gap” in the linked article just means incomplete preferences. The property in question is insensitivity to mild sweetening.
If one was exactly indifferent between 2 outcomes, I believe any improvement/worsening of one of them must make one prefer one of the outcomes over the other
But that’s exactly the point — incompleteness is not equivalent to indifference, because when you have an incomplete preference between 2 outcomes it’s not the case that a mild improvement/worsening makes you have a strict preference. I don’t understand what you think doesn’t “make sense in principle” about insensitivity to mild sweetening.
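One toy way to see the difference (my illustration, not from the linked article): represent each outcome’s value as an interval, with strict preference only when the intervals don’t overlap. Overlap then means incomparability rather than indifference, and a mild sweetening needn’t create a strict preference.

```python
def strictly_prefers(a, b):
    """Strict preference in an interval-valued model: every value in a's
    interval must exceed every value in b's."""
    return a[0] > b[1]

A = (0.0, 1.0)
B = (0.4, 0.6)

# A and B are incomparable: neither is strictly preferred, but they're not
# "indifferent" in the sense that would make sweetening decisive.
assert not strictly_prefers(A, B) and not strictly_prefers(B, A)

# Mildly sweeten A. If A and B were exactly indifferent, the sweetened option
# would now be strictly preferred to B; under incompleteness, it needn't be.
A_plus = (0.05, 1.05)
assert not strictly_prefers(A_plus, B)
```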
I fully endorse expectational total hedonistic utilitarianism (ETHU) in principle
As in you’re 100% certain, and wouldn’t put weight on other considerations even as a tiebreaker? That seems extreme. (If, say, you became convinced all your options were incomparable from an ETHU perspective because of cluelessness, you would presumably still all-things-considered-prefer not to do something that injures yourself for no reason.)
Thanks! I’ll just respond re: completeness for now.
When we ask “why should we maximize EV,” we’re interested in the reasons for our choices. Recognizing that I’m forced by reality to either donate or not-donate doesn’t help me answer whether it’s rational to strictly prefer donating, strictly prefer not-donating, be precisely indifferent, or none of the above.
Incomplete preferences have at least one qualitatively different property from complete ones, described here, and reality doesn’t force you to violate this property.
Not that you’re claiming this directly, but just to flag, because in my experience people often conflate these things: Even if in some sense your all-things-considered preferences need to be complete, this doesn’t mean your preferences w.r.t. your first-order axiology need to be complete. For example, take the donation case. You might be very sympathetic to a total utilitarian axiology, but when deciding whether to donate, your evaluation of the total utilitarian betterness-under-uncertainty of one option vs. another doesn’t need to be complete. You might, say, just rule out options that are stochastically dominated w.r.t. total utility, and then decide among the remaining options based on non-consequentialist considerations. (More on this idea here.)
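As a concrete sketch of that two-step procedure (my own illustration, with made-up probabilities): first filter out options that are first-order stochastically dominated w.r.t. total utility, then break the remaining tie on other grounds.

```python
import numpy as np

def dominates(p, q):
    """First-order stochastic dominance of lottery p over q, both given as
    probability vectors over the same ascending grid of total-utility levels."""
    cdf_p, cdf_q = np.cumsum(p), np.cumsum(q)
    return np.all(cdf_p <= cdf_q) and np.any(cdf_p < cdf_q)

# Probabilities over total-utility levels [low, medium, high] (illustrative).
options = {
    "donate":       np.array([0.1, 0.4, 0.5]),
    "don't donate": np.array([0.3, 0.0, 0.7]),
    "burn money":   np.array([0.5, 0.4, 0.1]),
}

undominated = [
    name for name, p in options.items()
    if not any(dominates(q, p) for other, q in options.items() if other != name)
]
print(undominated)  # ['donate', "don't donate"]: the axiology alone doesn't
# rank these two, so non-consequentialist considerations can break the tie.
```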
Why do you consider completeness self-evident? (Or continuity, although I’m more sympathetic to that one.)
Also, it’s important not to conflate “given these axioms, your preferences can be represented as maximizing expected utility w.r.t. some utility function” with “given these axioms [and a precise probability distribution representing your beliefs], you ought to make decisions by maximizing expected value, where ‘value’ is given by the axiology you actually endorse.” I’d recommend this paper on the topic (especially Sec. 4), and Sec. 2.2 here.
I mean, it seems to me like a striking “throw a ball in the air and have it land and balance perfectly on a needle” kind of coincidence to end at exactly — or indistinguishably close to — 50/50 (or at any other position of complete agnosticism, e.g. even if one rejects precise credences).
I don’t see how this critique applies to imprecise credences. Imprecise credences by definition don’t say “exactly 50/50.”
Up until the last paragraph, I very much found myself nodding along with this. It’s a nice summary of the kinds of reasons I’m puzzled by the theory of change of most digital sentience advocacy.
But in your conclusion, I worry there’s a bit of conflation between 1) pausing creation of artificial minds, full stop, and 2) pausing creation of more advanced AI systems. My understanding is that Pause AI is only realistically aiming for (2) — is that right? I’m happy to grant for the sake of argument that it’s feasible to get labs and governments to coordinate on not advancing the AI frontier. It seems much, much harder to get coordination on reducing the rate of production of artificial minds. For all we know, if weaker AIs suffer to a nontrivial degree, the pause could backfire because people would just use many more instances of these AIs to do the same tasks they would’ve otherwise done with a larger model. (An artificial sentience “small animal replacement problem”?)
I can accept the idea of X as an agent making decisions, and ask what those decisions are and what drives them, without implicitly accepting the idea that X has beliefs. Then “X has beliefs” is kind of a useful model for predicting their behaviour in the decision situations.
I think this is answering a different question, though. When talking about rationality and cause prioritization, what we want to know is what we ought to do, not how to describe our patterns of behavior after the fact. And when asking what we ought to do under uncertainty, I don’t see how we escape the question of what beliefs we’re justified in. E.g. betting on short AI timelines by opting out of your pension is only rational insofar as it’s rational to (read: you have good reasons to) believe in short timelines.
from my perspective the question of whether credences are ultimately indeterminate is … not so interesting? It’s enough that in practice a lot of credences will be indeterminate, and that in many cases it may be useful to invest time thinking to shrink our uncertainty, but in many other cases it won’t be
I’m not sure what you’re getting at here. My substantive claim is that in some cases, our credences about features of the far future might be sufficiently indeterminate that overall we won’t be able to determinately say “X is net-good for the far future in expectation.” If you agree with that, that seems to have serious implications that the EA community isn’t pricing in yet. If you don’t agree with that, I’m not sure if it’s because of (1) thorny empirical disagreements over the details of what our credences should be, or (2) something more fundamental about epistemology (which is the level at which I thought we were having this discussion, so far). I think getting into (1) in this thread would be a bit of a rabbit hole (which is better left to some forthcoming posts I’m coauthoring), though I’d be happy to give some quick intuition pumps. Greaves here (the “Suppose that’s my personal uber-analysis...” paragraph) is a pretty good starting point.
I’ll just reply (for now) to a couple of parts
No worries! Relatedly, I’m hoping to get out a post explaining (part of) the case for indeterminacy in the not-too-distant future, so to some extent I’ll punt to that for more details.
without having such an account it’s sort of hard to assess how much of our caring for non-hedonist goods is grounded in themselves, vs in some sense being debunked by the explanation that they are instrumentally good to care about on hedonist grounds
Cool, that makes sense. I’m all for debunking explanations in principle. Extremely briefly, here’s why I think there’s something qualitative that determinate credences fail to capture: If evidence, trustworthy intuitions, and appealing norms like the principle of indifference or Occam’s razor don’t uniquely pin down an answer to “how likely should I consider outcome X?”, then I think I shouldn’t pin down an answer. Instead I should suspend judgment, and say that there aren’t enough constraints to give an answer that isn’t arbitrary. (This runs deeper than “wait to learn / think more”! Because I find suspending judgment appropriate even in cases where my uncertainty is resilient. Contra Greg Lewis here.)
Is it some analogue of betting odds? Or what?
No, I see credences as representing the degree to which I anticipate some (hypothetical) experiences, or the weight I put on a hypothesis / how reasonable I find it. IMO the betting odds framing gets things backwards. Bets are decisions, which are made rational by whether the beliefs they’re justified by are rational. I’m not sure what would justify the betting odds otherwise.
how you’d be inclined to think about indeterminate credences in an example like the digits of pi case
Ah, I should have made clear, I wouldn’t say indeterminate credences are necessary in the pi case, as written. Because I think it’s plausible I should apply the principle of indifference here: I know nothing about digits of pi beyond the first 10, except that pi is irrational and I know irrational numbers’ digits are wacky. I have no particular reason to think one digit is more or less likely than another, so, since there’s a unique way of splitting my credence impartially across the possibilities, I end up with 50:50.[1]
Instead, here’s a really contrived variant of the pi case I had too much fun writing, analogous to a situation of complex cluelessness, where I’d think indeterminate credences are appropriate:
Suppose that Sally historically has an uncanny ability to guess the parity of digits of (conjectured-to-be) normal numbers with an accuracy of 70%. Somehow, it’s verifiable that she’s not cheating. No one quite knows how her guesses are so good.
Her accuracy varies with how happy she is at the time, though. She has an accuracy of ~95% when really ecstatic, ~50% when neutral, and only ~10% when really sad. Also, she’s never guessed parities of Nth digits for any N < 1 million.
Now, Sally also hasn’t seen the digits of pi beyond the first 10, and she guesses the 20th is odd. I don’t know how happy she is at the time, though I know she’s both gotten a well-earned promotion at her job and had an important flight canceled.
What should my credence in “the 20th digit is odd” be? Seems like there are various considerations floating around:
- The principle of indifference seems like a fair baseline.
- But there’s also Sally’s really impressive average track record on N ≥ 1 million.
- But also, I know nothing about what mechanism drives her intuition, so it’s pretty unclear whether her intuition generalizes to such a small N.
- And even setting that aside, since I don’t know how happy she is, should I just go with the base rate of 70%? Or should I apply the principle of indifference to the “happiness level” parameter, and assume she’s neutral (so 50%)?
- But presumably the evidence about the promotion and canceled flight tells me something about her mood. I guess slightly less than neutral overall (but I have little clue how she personally would react to these two things)? How much less?
I really don’t know a privileged way to weigh all this up, especially since I’ve never thought about how much to defer to a digit-guessing magician before. It seems pretty defensible to have a range of credences between, say, 40% and 75%. These endpoints themselves are kinda arbitrary, but at least seem considerably less arbitrary than pinning down to one number.
I could try modeling all this and computing explicit priors and likelihood ratios, but it seems extremely doubtful there’s gonna be one privileged model and distribution over its parameters.
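For instance, here’s a minimal sketch of two such models. The accuracies (95% / 50% / 10% by mood, 70% historical track record) come from the example; the mood weights and the probability that her skill generalizes to small N are assumptions I’m making up for illustration.

```python
def credence_odd(p_generalizes, mood_weights):
    """P(20th digit is odd | Sally guesses odd).

    p_generalizes: chance her skill extends to N < 1 million at all.
    mood_weights: probabilities over (ecstatic, neutral, sad), with
    accuracies 0.95, 0.50, 0.10 respectively (from the example).
    """
    accuracies = (0.95, 0.50, 0.10)
    acc_if_skill = sum(a * w for a, w in zip(accuracies, mood_weights))
    # If her skill doesn't generalize, her guess is uninformative: 50%.
    return p_generalizes * acc_if_skill + (1 - p_generalizes) * 0.5

# Model A: trust the track record; mood near its historical average.
print(credence_odd(1.0, (0.55, 0.35, 0.10)))  # ~0.71
# Model B: skeptical the skill generalizes to N = 20; mood skewed sad.
print(credence_odd(0.3, (0.10, 0.30, 0.60)))  # ~0.44
```

Neither model looks privileged, which is the point: the considerations constrain my credence to a rough range rather than pinning down a number.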
(I think forming beliefs about the long-term future is analogous in many ways to the above.)
Not sure how much that answers your question? Basically I ask myself what constraints the considerations ought to put on my degree of belief, and try not to needlessly get more precise than those constraints warrant.
[1] I don’t think this is clearly the appropriate response. I think it’s kinda defensible to say, “This doesn’t seem like qualitatively the same kind of epistemic situation as guessing a coin flip. I have at least a rough mechanistic picture of how coin flips work physically, which seems symmetric in a way that warrants a determinate prediction of 50:50. But with digits of pi, there’s not so much a ‘symmetry’ as an absence of a determinate asymmetry.” But I don’t think you need to die on that hill to think indeterminacy is warranted in realistic cause prio situations.
Instead I’m saying that in many decision-situations people find themselves in, although they could (somewhat) narrow their credence range by investing more thought, in practice the returns from doing that thinking aren’t enough to justify it, so they shouldn’t do the thinking.
(I don’t think this is particularly important, you can feel free to prioritize my other comment.) Right, sorry, I understood that part. I was asking about an implication of this view. Suppose you have an intervention whose sign varies over the range of your indeterminate credences. Per the standard decision theory for indeterminate credences, then, you currently don’t have a reason to do the intervention — it’s not determinately better than inaction. (I’ll say more about this below, re: your digits of pi example.) So if by “the returns from doing that thinking aren’t enough to justify it” you mean you should just do the intervention in such a case, that doesn’t make sense to me.
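To make the sign-variation point concrete, a quick arithmetic sketch (my numbers, purely illustrative): suppose the intervention yields +10 if hypothesis X holds and -10 otherwise, and your credence in X is indeterminate over [0.4, 0.75].

```python
ev = lambda p: p * 10 + (1 - p) * (-10)  # expected value at credence p
print(ev(0.40), ev(0.75))  # -2.0 and 5.0: the EV's sign flips across the
# credence range, so the intervention isn't determinately better than
# inaction (EV 0) -- standard decision rules for imprecise credences
# (e.g. maximality) then don't recommend it over doing nothing.
```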
Thanks! I unfortunately don’t have time to engage fully with this thread going forward, but briefly:
To be clear, I don’t share Karnofsky’s overall framework. I’m skeptical of the “regression to normality” criterion myself. (And I don’t find his model of the problem behind Pascal’s mugging probabilities compelling, since he still uses precise estimates.)
In the Pascal’s mugging case, I think people have some fuzzy sense that the mugger’s claim is made-up, which can be more carefully operationalized with imprecise credences. But if we can’t even point to what our “this is absurd” reaction is about, and are instead merely asserting that our pretheoretic sense should dictate our decisions, I’m more skeptical. Especially when we’re embracing an ethical principle most people would consider absurd (impartial altruism).