Feedback from where?

November 2022 update: I wrote this post during a difficult period in my life. I still agree with the basic point I was gesturing towards, but regret some of the presentation decisions I made. I may make another attempt in the future.


“A system that ignores feedback has already begun the process of terminal instability.”

– John Gall, Systemantics

(My request from last time still stands.)

jimrandomh wrote a great comment in response to my last post:

The core thesis here seems to be:

“I claim that [cluster of organizations] have collectively decided that they do not need to participate in tight feedback loops with reality in order to have a huge, positive impact.”

There are different ways of unpacking this, so before I respond I want to disambiguate them. Here are four different unpackings:

  1. Tight feedback loops are important, [cluster of organizations] could be doing a better job creating them, and this is a priority. (I agree with this. Reality doesn’t grade on a curve.)

  2. Tight feedback loops are important, and [cluster of organizations] is doing a bad job of creating them, relative to organizations in the same reference class. (I disagree with this. If graded on a curve, we’re doing pretty well.)

  3. Tight feedback loops are important, but [cluster of organizations] has concluded in their explicit verbal reasoning that they aren’t important. (I am very confident that this is false for at least some of the organizations named, where I have visibility into the thinking of decision makers involved.)

  4. Tight feedback loops are important, but [cluster of organizations] is implicitly deprioritizing and avoiding them, by ignoring/forgetting discouraging information, and by incentivizing positive narratives over truthful narratives.

(4) is the interesting version of this claim, and I think there’s some truth to it. I also think that this problem is much more widespread than just our own community, and fixing it is likely one of the core bottlenecks for civilization as a whole.

I think part of the problem is that people get triggered into defensiveness; when they mentally simulate (or emotionally half-simulate) setting up a feedback mechanism, if that feedback mechanism tells them they’re doing the wrong thing, their anticipations put a lot of weight on the possibility that they’ll be shamed and punished, and not much weight on the possibility that they’ll be able to switch to something else that works better. I think these anticipations are mostly wrong; in my anecdotal observation, the actual reaction organizations get to poor results followed by a pivot is usually at least positive about the pivot, at least from the people who matter. But getting people who’ve internalized a prediction of doom and shame to surface those models, and do things that would make the outcome legible, is very hard.

...

I replied:

Thank you for this thoughtful reply! I appreciate it, and the disambiguation is helpful. (I would personally like to do as much thinking-in-public about this stuff as seems feasible.)

I mean a combination of (1) and (4).

I used to not believe that (4) was a thing, but then I started to notice (usually unconscious) patterns of (4) behavior arising in myself. As I investigated further I kept noticing more & more (4) behavior in me, so now I think it’s really a thing (because I don’t believe that I’m an outlier in this regard).

...

I agree with jimrandomh that (4) is the most interesting version of this claim. What would it look like if the cluster of EA & Rationality organizations I pointed to last time were implicitly deprioritizing getting feedback from reality?

I don’t have a crisp articulation of this yet, so here are some examples that seem to me to gesture in that direction:

Please don’t misunderstand – I’m not suggesting that the people involved in these examples are doing anything wrong. I don’t think that they are behaving malevolently. The situation seems to me to be more systemic: capable, well-intentioned people begin participating in an equilibrium wherein the incentives of the system encourage drift away from reality.

There are a lot of feedback loops in the examples I list above… but those loops don’t seem to connect back to reality, to the actual situation on the ground. Instead, they seem to spiral upwards – metrics tracking opinions, metrics tracking the decisions & beliefs of other people in the community. Goodhart’s Law (“when a measure becomes a target, it ceases to be a good measure”) neatly sums up the problem.

Why does this happen? Why do capable, well-intentioned people get sucked into equilibria that are deeply, obviously strange?

Let’s revisit this part of jimrandomh’s great comment:

I think part of the problem is that people get triggered into defensiveness; when they mentally simulate (or emotionally half-simulate) setting up a feedback mechanism, if that feedback mechanism tells them they’re doing the wrong thing, their anticipations put a lot of weight on the possibility that they’ll be shamed and punished, and not much weight on the possibility that they’ll be able to switch to something else that works better. I think these anticipations are mostly wrong; in my anecdotal observation, the actual reaction organizations get to poor results followed by a pivot is usually at least positive about the pivot, at least from the people who matter. But getting people who’ve internalized a prediction of doom and shame to surface those models, and do things that would make the outcome legible, is very hard.

I don’t have a full articulation yet, but I think this starts to get at it. The strange equilibria fulfill a real emotional need for the people who are attracted to them (see Core Transformation for discussion of one approach towards developing an alternative basis for meeting this need).

And from within an equilibrium like this, pointing out the dynamics by which it maintains homeostasis is often perceived as an attack...