I think it’s an Eliezer-neologism that was meant to highlight an analogy between moral consequentialism (‘you should base actions on their consequences’) and the kind of reasoning he’s talking about (‘you do base actions on their consequences’).
First reference I see to it is in Protein Reinforcement and DNA Consequentialism (2007, emphasis added):
There’s a long chain of causality whereby a male squirrel, eating a nut today, produces more offspring months later: Chewing and swallowing food, to digesting food, to burning some calories today and turning others into fat, to burning the fat through the winter, to surviving the winter, to mating with a female, to the sperm fertilizing an egg inside the female, to the female giving birth to an offspring that shares 50% of the squirrel’s genes.
With the sole exception of humans, no protein brain can imagine chains of causality that long, that abstract, and crossing that many domains. With one exception, no protein brain is even capable of drawing the consequential link from chewing and swallowing to inclusive reproductive fitness.
[...]
Why not learn to like food based on reproductive success, so that you’ll stop liking the taste of candy if it stops leading to reproductive success? Why don’t birds wait and see which wing-flapping policies result in more eggs, not just more stability?
Because it takes too long. Reinforcement learning still requires you to wait for the detected consequences before you learn.
Now, if a protein brain could imagine the consequences, accurately, it wouldn’t need a reinforcement sensor that waited for them to actually happen.
Put a food reward in a transparent box. Put the corresponding key, which looks unique and uniquely corresponds to that box, in another transparent box. Put the key to that box in another box. Do this with five boxes. Mix in another sequence of five boxes that doesn’t lead to a food reward. Then offer a choice of two keys, one which starts the sequence of five boxes leading to food, one which starts the sequence leading nowhere.
Chimpanzees can learn to do this. (Dohl 1970.) So consequentialist reasoning, backward chaining from goal to action, is not strictly limited to Homo sapiens.
But as far as I know, no non-primate species can pull that trick. And working with a few transparent boxes is nothing compared to the kind of high-falutin’ cross-domain reasoning you would need to causally link food to inclusive fitness. (Never mind linking reciprocal altruism to inclusive fitness). Reinforcement learning seems to evolve a lot more easily.
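To make “backward chaining from goal to action” concrete, here’s a minimal Python sketch of the box-and-key setup he describes. The box names, the CONTENTS model, and backward_chain are all mine and purely illustrative; the point is just that the plan is built from an imagined causal model, before any box is opened and any reward is actually received.

```python
# Toy sketch (all names and the causal model are hypothetical): backward
# chaining from the goal ("food") to the first available action, using an
# imagined model of the boxes rather than a reward that has already arrived.

# Imagined causal model: what each transparent box can be seen to contain.
CONTENTS = {
    "box1": "food",          # the last box holds the food reward
    "box2": "key_to_box1",
    "box3": "key_to_box2",
    "box4": "key_to_box3",
    "box5": "key_to_box4",
    # the decoy chain of five boxes that leads nowhere
    "boxA": "nothing",
    "boxB": "key_to_boxA",
    "boxC": "key_to_boxB",
    "boxD": "key_to_boxC",
    "boxE": "key_to_boxD",
}

def backward_chain(goal, keys_in_hand):
    """Work backward from the goal to an action available right now.

    Returns the plan (key and boxes in execution order), or None if no
    offered key starts a chain that reaches the goal.
    """
    plan = []
    need = goal
    while True:
        # Which box would have to be opened to obtain what we need?
        box = next((b for b, c in CONTENTS.items() if c == need), None)
        if box is None:
            return None                  # nothing yields it: dead end
        plan.append(box)
        need = "key_to_" + box           # ...and that box needs its own key
        if need in keys_in_hand:
            plan.append(need)
            return list(reversed(plan))  # the first action comes last in the chain

# Offered two keys; pick the one whose imagined chain ends at the food.
offered = {"key_to_box5", "key_to_boxE"}
print(backward_chain("food", offered))
# ['key_to_box5', 'box5', 'box4', 'box3', 'box2', 'box1']
```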
Followed up in Thou Art Godshatter:
Now it’s clear, as was discussed yesterday, that it’s hard to build a powerful enough consequentialist. Natural selection sort-of reasons consequentially, but only by depending on the actual consequences. Human evolutionary theorists have to do really high-falutin’ abstract reasoning in order to imagine the links between adaptations and reproductive success.
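The contrast both passages lean on — update only after the detected consequence actually arrives, versus choose by imagining the consequence in advance — fits in a few lines. A rough sketch, with made-up payoffs and function names:

```python
import random

# Toy setup (numbers are invented): two actions with fixed hidden payoffs.
HIDDEN_PAYOFF = {"eat_nut": 1.0, "eat_pebble": 0.0}

def reinforcement_learner(trials=100, lr=0.1, eps=0.1):
    """Learns only by waiting for the detected consequence of each action."""
    value = {"eat_nut": 0.0, "eat_pebble": 0.0}
    for _ in range(trials):
        # explore sometimes, otherwise exploit the current value estimates
        act = (random.choice(list(value)) if random.random() < eps
               else max(value, key=value.get))
        reward = HIDDEN_PAYOFF[act]               # must actually happen first
        value[act] += lr * (reward - value[act])  # update after the fact
    return max(value, key=value.get)

def imagined_choice(model):
    """Chooses by imagining consequences with a model, before acting at all."""
    return max(model, key=model.get)

imagined_outcomes = {"eat_nut": 1.0, "eat_pebble": 0.0}
print(reinforcement_learner())           # typically 'eat_nut', after ~100 real rewards
print(imagined_choice(imagined_outcomes))  # 'eat_nut', with zero trials
```

The reinforcement learner needs many real trials and real rewards before it reliably favors the nut; the model-based chooser needs none, but it does need an accurate imagined model of the consequences — which is the part the quoted posts say almost no protein brain can manage.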