Researcher at the Center on Long-Term Risk. All opinions my own.
Anthony DiGiovanni 🔸
But we also have strong reasons to trust that some process designed our cooperative instincts to allow groups of humans to cooperate effectively.
“[A]llowing [small] groups of humans to cooperate effectively” is very far from “making the far future better, impartially speaking”. I’d be interested in your responses to the arguments here.
I also think that many individuals need to decide how to make their lives go well in pretty confusing circumstances. Imagine deciding whether to immigrate to America in the 1700s, or how to live in the shadow of the Cold War, or whether to genetically engineer your children.
First, it’s not clear to me these people weren’t clueless (i.e., really had more reason to choose whatever they chose than the alternatives), depending on how long a time horizon they were aiming to make go well.
Second, insofar as we think these people’s choices were justified, I don’t see why you think their instincts gave them such justification. Why would these instincts track unprecedented consequences so well?
and the net value of the things that it’s done so far may well be dominated by what updates it makes based on that experience
I don’t think “may well” gets us very far. Can you say more about why this hypothesis is so much more likely than, say, “the dominant impacts are the damage that’s already been done”, or “the dominant impacts will come from near-future decisions, made by actors who are still too ignorant about the extremely complex system they’re intervening in”?
do you think that, if we had a theory of sociopolitics that was about as good as 20th-century economics, then we wouldn’t be clueless about how to do sociopolitical interventions (like founding AI safety movements) effectively?
No, because I think “founding AI safety movements that succeed at making the far future go better” is a pretty out-of-distribution kind of sociopolitical intervention.
Thanks for explaining! To summarize, I think there are crucial disanalogies between the “make their own life go well” case and the “make the future of humanity go well” case:
Should they follow those instincts even when they don’t know why evolution instilled them? … (Note that this doesn’t rely on them understanding evolution; even 1000 years ago they could have trusted that some process designed their body and mind to function well, despite radical uncertainty about what that process was.)
In this case, the reasons to trust “that some process designed their body and mind to function well” are relatively strong, because of how we’re defining “well”: an individual’s survival in a not-too-extreme environment. Even if they don’t understand evolution, they can think about how their instincts plausibly would’ve been honed on feedback relevant to this objective. And/or they can look at how other individuals (or their past self) have tended to survive when they trusted these kinds of instincts.[1]
Now consider someone who wants to make the future of humanity go well. Similarly, they have certain cooperative instincts ingrained in them, e.g., the instinct towards honesty. All else equal, it seems pretty reasonable to think that following them will help humanity cooperate better, and that this will allow humanity to avoid internal conflict.
Here, the reasons to trust that the instincts track the objective seem way weaker to me,[2] for all the reasons I discuss here: no feedback loops, radically unfamiliar circumstances due to the advent of ASI and the like, a track record of sign-flipping considerations. All else is really not equal.
You yourself have written about how the AIS movement arguably backfired hard, which I really appreciate! I expect that ex ante, people founding this movement told themselves: “All else equal, it seems pretty reasonable to think that trying to warn people of a source of x-risk, and encouraging research on how to prevent it, will help humanity avoid that x-risk.”
(I think analogous problems apply to your subagent and forward-chaining framings. They’re justified when the larger system provides feedback, or the forward steps have been validated in similar contexts, which we’re missing here.)
How does this relate to cluelessness? Mostly I don’t really know what the term means
The way I use the term, you’re clueless about how to compare A vs. B, relative to your values, if it seems arbitrary (upon reflection) to say A is better than, worse than, or exactly as good as B, and instead it seems we should consider A’s and B’s goodness incomparable.
- ^
What if someone has always been totally solitary, doesn’t understand evolution or feedback loops, and hasn’t made many decisions based on similar instincts? Seems like such a person wouldn’t have reasons to trust their instincts! They’d just be getting lucky.
- ^
See here for my reply to: “Sure, ‘way weaker’, but they’re still slightly better than chance, right?” Tl;dr: This doesn’t work because the problem isn’t just noise that weakens the signal, it’s “it’s ambiguous what the direction of the signal even is”.
Thanks! I’ve read both, but neither seems to answer my objection.
The stuff on cluelessness feels like it’s conceding a little too much to the EA/Bayesian frame. It’s implying that you should have a model of the entire future in order to make decisions. But what I think you actually want to claim is that it’s sensible and even “rational” to make non-model-based decisions (e.g., via heuristics, intuitions, etc.).
I’d be interested in hearing more on what exactly you mean by this. Insofar as someone wants to make decisions based on impartially altruistic values, I think cluelessness is their problem, even if they don’t make decisions by explicitly optimizing w.r.t. a model of the entire future. If such a person appeals to some heuristics or intuitions as justification for their decisions, then (as argued here) they need to say why those heuristics or intuitions reliably track impact on the impartial good. And the case for that looks pretty dubious to me.
(If you’re rejecting the “make decisions based on impartially altruistic values” step, fair enough, though I think we’d do well to be explicit about that.)
If you had more evidence, you could make the comparison. But you currently have no clue which direction the comparison would go, in expectation over the evidence you might receive. So how are you supposed to compare them right now?
My best guess about which of 2 identical objects has a larger mass in expectation will be arbitrary if their mass only differs by 10^-6 kg, and I have no way of assessing this small difference. However, this does not mean the expected mass of the 2 objects is fundamentally incomparable.
I worry you’re reifying “expectations” as something objective here. The relative actual masses of the objects are clearly comparable. But if you subjectively can’t compare them, then they’re indeed incomparable “in expectation” in the relevant sense.
Besides the links Michael shared, I highly recommend this really short post.
However, the same goes for comparisons among the expected mass of seemingly identical objects with a similar mass if I can only assess their mass using my hands, but this does not mean their mass is incomparable.
I don’t exactly understand what argument you’re making here.
My core argument in the post is: Take any intervention X. We want to weigh up its impact for all sentient beings across the cosmos, where this “weighing up” is aggregation over all hypotheses. Now suppose we want to force ourselves to compare X with inaction, i.e., say either UEV(do X) > UEV(don’t do X) or vice versa. We have such an extremely coarse-grained understanding (if any) of these hypotheses[1] that, when we do the weighing-up, whether we say UEV(do X) > UEV(don’t do X) or vice versa seems to depend on an arbitrary choice.
Can you say how your argument relates to mine?
- ^
Relative to the amount of fine-grained detail necessary to evaluate the hypothesis, when what we value is “well-being of all sentient beings across the cosmos”.
In normal situations, an agent can rationally come to a single probability distribution, but Greaves argues that, in a situation of complex cluelessness, an individual should instead have a set of probability functions that they are “rationally required to remain neutral between.” I’m not entirely sure what this means.
You might be interested in this post I wrote explaining imprecision; hopefully it answers “what this means”.
I’m still not sure I understand why you find the arguments in the linked post, and post #3 of the sequence, uncompelling. Can you say more on that?
Given that the intervals are both derived from a representor P, the interval of EV diffs is {EV_p(A) - EV_p(B) | p in P}. See also here.
Ah, I missed the “2 states of the world which are exactly the same” part, sorry. Then yeah, the EVs would be the same. I’m not sure how this is supposed to support your original comment’s argument, though.
Depends on the details of what the intervals are supposed to represent. E.g.:
Say you have a representor (imprecise probabilities) where EV_P(A) = EV_P(B) = [-1, 1].
On one hand:
If:
for p1 in P, EV_p1(A) = -1 while EV_p1(B) = 1, and
for p2 in P, EV_p2(A) = 1 while EV_p2(B) = -1,
then A and B are incomparable.
OTOH:
If for all p in P, EV_p(A) = EV_p(B), then A and B are comparable.
(Ofc there are lots of other cases.)
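The two cases above can be sketched numerically. Here is a minimal toy example with a hypothetical two-state world and made-up payoffs; the representor is just a finite set of probability distributions, and all numbers are mine for illustration:

```python
# Toy sketch of imprecise EV comparison via a representor (a finite set of
# probability distributions over two states). Payoffs are hypothetical.

def ev(dist, payoffs):
    """Expected value of an action under one distribution."""
    return sum(p * x for p, x in zip(dist, payoffs))

# A representor with two distributions over the two states.
representor = [(0.9, 0.1), (0.1, 0.9)]

# Case 1: A and B have mirror-image payoffs, so their EVs span the same
# interval but in opposite directions across the representor. The interval
# of EV *differences* then crosses zero: A vs. B is incomparable.
A = (-1.25, 1.25)   # payoff of A in each state
B = (1.25, -1.25)   # payoff of B in each state
diffs = [ev(p, A) - ev(p, B) for p in representor]
print(diffs)  # one negative, one positive

# Case 2: identical payoffs state-by-state, so EV_p(A) = EV_p(B) for
# every p in the representor: the two actions are (trivially) comparable.
C = (3.0, -2.0)
diffs_same = [ev(p, C) - ev(p, C) for p in representor]
print(diffs_same)  # all zeros
```

In case 1, no amount of sharpening within the representor settles the comparison: one distribution favors A, the other favors B.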
How to not do decision theory backwards
When do intuitions need to be reliable?
Hi Vasco. I think Figure 3 here, and the surrounding discussion of how imprecision works, might answer your objection.
The idea is:
Suppose two actions have precise EVs. You’ll presumably grant that a tiny change in the (expected) location of electrons can flip the difference in EV from +epsilon to -epsilon.
If so, then a tiny change in the (expected) location of electrons can flip the lower bound of an imprecise difference in EV from +epsilon to -epsilon.
What makes two actions incomparable, under the imprecise EV model, is that the interval of EV differences crosses zero.
So, it’s unsurprising that a tiny change in the (expected) location of electrons can flip the two actions from “comparable” to “incomparable”.
Can you say which step in this argument you reject, and why?
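The flip at the boundary can be illustrated with a toy predicate; the `comparable` function and the interval endpoints below are my own made-up illustration, not anything from the linked post:

```python
# Toy illustration: under the imprecise-EV model, A vs. B is incomparable
# iff the interval of EV differences {EV_p(A) - EV_p(B) : p in P} crosses
# zero. The endpoint numbers here are made up.

def comparable(lo, hi):
    """True iff the interval [lo, hi] of EV differences lies strictly on
    one side of 0 (or is exactly {0}, i.e. the actions are equally good)."""
    return lo > 0 or hi < 0 or (lo == 0 and hi == 0)

eps = 1e-9

# Interval just above zero: A is determinately better than B.
print(comparable(eps, 0.5))    # True

# Nudge the lower bound from +eps to -eps (e.g. via a tiny change in the
# expected location of some electrons): the interval now crosses zero,
# and the verdict flips to "incomparable".
print(comparable(-eps, 0.5))   # False
```

So the discontinuity isn’t special to imprecision: it is the same knife-edge that already exists between +epsilon and -epsilon EV differences in the precise model.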
especially the observation that successful prediction systems across most domains use cluster not sequence thinking.
I find this “observation” confusing/misleading, given that Holden defines cluster thinking as aggregating decisions from multiple perspectives. This is very different from aggregating the predictions of multiple models. The evidence of “success” he cites only applies to the latter (where “success” is with respect to Brier scores and such), not the former.
And this is practically relevant: If you aggregate multiple models but then maximize EV under the aggregated model, you don’t get the “sandboxing” property Holden claims cluster thinking satisfies. The fanatical/Pascalian model will still dominate the EV calculation.
(ETA: As an aside on sequence thinking / cluster thinking generally, I wish these discussions made it very clear whether we’re taking ST/CT as (1) different normative standards for good epistemology / decision-making per se, vs. as (2) different procedures for satisfying a given epistemological / decision-theoretic standard. Cf. “criterion of rightness vs. decision procedure” in ethics. This would be helpful for clarifying what’s meant by claims like “cluster thinking is how ‘successful’ prediction systems operate”. I’ve been assuming (2) here, FWIW.)
I think if you’re savvy you will probably find a way to make the astronomical thing go better, such as doing strategy/prioritization/deconfusion work, or working on robustly good intermediate desiderata, or building skills/money in case there’s more clarity in the future
What do you think about the arguments for cluelessness from imprecision, e.g., here? (I explain more why I think we’re clueless even about the things you list, here.)
Thanks for this! For what it’s worth, some issues I’ve found with the “CRIBS” and “EA Epistemic Auditor” reviews for drafts of philosophical blog posts:
excessively allergic to “hedging”, and to sections of posts meant to preempt very important misreadings
flagging some points as “hidden assumptions” even when they’re explicitly addressed in the post, or seem clearly irrelevant to the argument
critiquing claim X as not empirically supported, when X is the claim “Y isn’t empirically supported”.
But they’re somewhat useful for surfacing what kinds of misunderstandings readers might have.
I don’t know what you mean by “practically the same”; can you say more?
Regardless, the problem is that “gathering evidence” vs. “doing something else” is itself a decision, whose consequences you’ll be clueless about. I discuss this more here.