By “copies”, I meant “agents which action-correlate with you” (i.e., those which will cooperate if you cooperate), not “agents sharing your values”. Sorry for the confusion.
Do you think all agents thinking superrationally action-correlate? This seems like a very strong claim to me. My impression is that the set of agents with a decision algorithm similar enough to mine to (significantly) action-correlate with me is a very small subset of all superrationalists. As your post suggests, even your past self doesn’t fully action-correlate with you (although you don’t need “full correlation” for cooperation to be worthwhile, of course).
In a one-shot prisoner’s dilemma, would you cooperate with anyone who agrees that superrationality is the way to go?
In his paper on ECL, Caspar Oesterheld says (section 2, p. 9): “I will tend to make arguments from similarity of decision algorithms rather than from common rationality, because I hold these to be more rigorous and more applicable whenever there is no authority to tell my collaborators and me about our common rationality.” However, he also often uses “the agents with a decision algorithm similar enough to mine to (significantly) action-correlate with me” and “all superrationalists” interchangeably, which confuses me a lot.
Do you think all agents thinking superrationally action-correlate?
Yes, but by implication not assumption. (Also no, not perfectly at least, because we’ll all always have some empirical uncertainty.)
Superrationalists want to compromise with each other (if they have the right aggregative-consequentialist mindset), so they try to infer what everyone else wants (in some immediate, pre-superrationality sense), calculate the compromise that follows from that, determine what actions that compromise implies for the context in which they find themselves (resources and whatnot), and then act accordingly. These final acts can be very different depending on their contexts, but the compromise goals from which they follow correlate to the extent to which they were able to correctly infer what everyone wants (including bargaining solutions etc.).
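To make that procedure concrete, here is a minimal toy sketch in Python (the value systems, the numbers, and the plain normalized-sum compromise are all invented for illustration; a real compromise would come from whatever bargaining solution the agents converge on):

```python
# Toy sketch of the superrational compromise step (illustrative numbers only).
value_systems = ["suffering_reduction", "flourishing", "knowledge"]

# Agent A's local options: estimated marginal utility per unit of resource,
# broken down by value system.
options_agent_a = {
    "fund_welfare_project": {"suffering_reduction": 0.9, "flourishing": 0.2, "knowledge": 0.0},
    "fund_research":        {"suffering_reduction": 0.1, "flourishing": 0.3, "knowledge": 0.8},
}

# Weight each value system gets in the compromise; a plain normalized sum here,
# standing in for whatever bargaining solution is actually used.
compromise_weights = {v: 1 / len(value_systems) for v in value_systems}

def compromise_score(option):
    """Score a local action by the shared compromise utility function."""
    return sum(compromise_weights[v] * option[v] for v in value_systems)

best_action = max(options_agent_a, key=lambda name: compromise_score(options_agent_a[name]))
print(best_action)  # "fund_research" with these made-up numbers
```

Another agent with different local options would pick a different action from the same compromise score, which is the sense in which the final acts differ while the compromise goals correlate.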
In a one-shot prisoner’s dilemma, would you cooperate with anyone who agrees that superrationality is the way to go?
Yes.
Hmm, it’s been a couple of years since I read the paper, so I’m not sure how that is meant… But I suppose either (1) the decision algorithm is similar because it goes through the superrationality step, or (2) the decision algorithm has to be a bit similar in order for people to consider superrationality in the first place. You need to subscribe to non-causal DTs or maybe have indexical uncertainty of some sort. It might be something that religious people and EAs come up with but that seems weird to most other people. (I think Calvinists have these EDT leanings, so maybe they’d embrace superrationality too? No idea.) I think superrationality breaks down in many earth-bound cases because too many people here would consider it weird (like the whole CDT crowd, probably), unless they are aware of their indexical uncertainty, but that’s also still considered a bit weird.
Oh interesting! Ok so I guess there are two possibilities.
1) Either by “superrationalists”, you mean something stronger than “agents taking acausal dependencies into account in PD-like situations”, which I thought was roughly Caspar’s definition in his paper. And then, I’d be even more confused.
2) Or you really think that taking acausal dependencies into account is, by itself, sufficient to create a significant correlation between two decision algorithms. In that case, how do you explain that I would defect against you and exploit you in a one-shot PD (very sorry, I just don’t believe we correlate ^^), despite being completely on board with superrationality? How is that not a proof that common superrationality is insufficient?
(Btw, happy to jump on a call to talk about this if you’d prefer that over writing.)
I think it’s closer to 2, and the clearer term to use is probably “superrational cooperator,” but I suppose that’s probably what’s meant by “superrationalist”? Unclear. But “superrational cooperator” is clearer about (1) knowing about superrationality and (2) wanting to reap the gains from trade from superrationality. Condition 2 can be false because people use CDT or because they have very local or easily satisfied values and don’t care about distant or additional stuff.
So just as in all the thought experiments where EDT gets richer than CDT, your own behavior is the only evidence you have about what others are likely to predict about you. The multiverse part probably smooths that out a bit, so your own behavior gives you evidence of increasing or decreasing gains from trade as the fraction of agents in the multiverse that you think cooperate with you increases or decreases.
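As a minimal worked example of that evidential step (the payoffs and conditional credences below are made up): if your own choice shifts your credence about whether the correlated agents cooperate with you, cooperating can come out ahead even in a one-shot setting.

```python
# Toy EDT-style expected value in a one-shot PD against distant agents.
# Payoffs to me: (C, C) = 3, (D, C) = 5, (D, D) = 1, (C, D) = 0.  Illustrative only.

def expected_value(my_action, p_coop_if_i_coop, p_coop_if_i_defect):
    """EV of my_action when my own choice is evidence about whether the
    correlated agents cooperate (the evidential step)."""
    p = p_coop_if_i_coop if my_action == "cooperate" else p_coop_if_i_defect
    if my_action == "cooperate":
        return p * 3 + (1 - p) * 0
    return p * 5 + (1 - p) * 1

# Suppose choosing to cooperate is strong (not perfect) evidence that the
# relevantly similar agents cooperate too, and defecting is evidence they defect.
ev_coop = expected_value("cooperate", 0.8, 0.2)   # 0.8 * 3           = 2.4
ev_defect = expected_value("defect", 0.8, 0.2)    # 0.2 * 5 + 0.8 * 1 = 1.8
print(ev_coop, ev_defect)  # cooperating wins in expectation under these assumptions
```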
I think it would be “hard” to try to occupy that Goldilocks zone where you maximize the number of agents who wrongly believe that you’ll cooperate while you’re really defecting, because you’d have to simultaneously believe that you’re the sort of agent that cooperates despite actually defecting, which should give you evidence that you’re wrong about what reference class you’re likely to be put in. There may be agents like that out there, but even if that’s the case, they won’t have control over it. The way this will probably be factored in is that superrational cooperators will expect a slightly lower cooperation incidence from agents in reference classes that are empirically very likely to cooperate while not being physically forced to: being in such a reference class makes defection more profitable, up to the point where it actually changes the assumptions others are likely to make about the reference class, which is what enabled the effect in the first place. That could mean that, for any given reference class of agents who are able to defect, cooperation “densities” over 99% or so get rapidly less likely.
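One way to put a toy model behind that last point (the functional forms and the temptation rate are invented for illustration, not anything from ECL itself): if others extend cooperation to a reference class in proportion to its believed cooperation density, while an individual member’s gain from defecting grows with that density, the believed density self-corrects to a fixed point below 1.

```python
# Toy fixed-point sketch: why believed cooperation densities near 100% are
# self-undermining.  Functional forms and constants are invented for illustration.

def defection_gain(q):
    """Extra payoff an individual gets by defecting while others cooperate
    with its reference class at rate q (same 5/3/1/0 payoffs as above)."""
    return (5 * q + 1 * (1 - q)) - 3 * q   # grows as q approaches 1

def realized_density(q, temptation_rate=0.02):
    """Fraction of class members who actually cooperate, assuming the
    temptation to defect scales with the gain from defecting."""
    return max(0.0, 1.0 - temptation_rate * defection_gain(q))

# Iterate belief -> behavior -> belief until it settles.
q = 1.0
for _ in range(100):
    q = realized_density(q)
print(round(q, 3))  # ~0.961: near-certain cooperation raises the payoff to
                    # defection, which caps the density somewhat below 1
```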
But really, I think, the winning strategy for anyone at all interested in distant gains from trade is to be a very simple, clear kind of superrational cooperator, because that maximizes the chances that others will cooperate with that sort of agent. All that “trying to be clever” and “being the sort of agent that tries to be clever” probably costs so much in gains from trade right away that you’d have to value the distant gains from trade very low compared to your local stuff for it to make any economic sense, and then you can probably forget about the gains from trade anyway because others will also predict that. I think David Althaus and Johannes Treutlein have thought about this from the perspective of different value systems, but I don’t know of any published write-ups from that.
We can have a chat some time, gladly! But it’s been a while since I’ve done all this, so I’m a bit slow. ^.^′
Thanks for the reply! :)