When I zoom out on what sort of thing is happening when an agent engages in deliberative ladders, it looks like an agent optimized for additive search spaces struggling to deal with a multiplicative one. To expand on this: when I look at human cognition (the structure and limitations of working and associative memory, our innate risk tolerance and hyperbolic discounting, our pursuit of comparative advantage), together with reasonable conjectures about the payoff distribution in the ancestral environment, I see adaptation to an additive search space. That is to say, if you have a bunch of slot machines each with a different payout, we find the best one (or more accurately, the minimal set that will satisfy our needs along the various dimensions of payout) and keep pulling. In contrast, we now find ourselves in a potentially multiplicative search space, i.e. one where the payout of any given slot machine can (via sign considerations) affect the payout of all the others.
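To make the contrast concrete, here is a toy sketch. The payout numbers and the particular way of combining them (a product of signs applied to a sum of magnitudes) are assumptions made purely for illustration, not a claim about how real considerations combine.

```python
from math import prod

def additive_value(payouts):
    """Additive world: each machine contributes independently to the total."""
    return sum(payouts)

def multiplicative_value(payouts, signs):
    """Toy multiplicative world (an assumed model): every consideration
    carries a sign (+1 or -1) that applies to the payout of everything else."""
    return prod(signs) * sum(payouts)

known = [3.0, 5.0, 2.0]  # hypothetical payouts

# Additive: a newly discovered machine shifts the total by its own payout only.
additive_before = additive_value(known)
additive_after = additive_value(known + [0.5])

# Multiplicative: a single newly discovered sign consideration flips everything.
multiplicative_before = multiplicative_value(known, [1, 1])
multiplicative_after = multiplicative_value(known, [1, 1, -1])

print(additive_before, additive_after)              # 10.0 10.5
print(multiplicative_before, multiplicative_after)  # 10.0 -10.0
```

In the additive case the cost of not having found the last machine is bounded by that machine's payout. In the toy multiplicative case it is bounded only by the value of everything else.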
This drastically changes the calculus of the exploration-exploitation tradeoff. We’re not even sure that the problem is tractable, because we don’t know the size of the search space. But one thing it definitely prescribes is dramatically more investment in exploration relative to exploitation. The rate at which such efforts turn up new crucial considerations might itself give us some data: if your trajectory asymptotes, you have gained some knowledge about the search space, whereas if it remains spiky with large course corrections, you should suspect there is still a lot of value in further exploration.
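One rough way to operationalize "asymptoting versus spiky" is to log your overall estimate after each round of exploration and look at the size of the recent course corrections. The function names, the window, the tolerance, and the example trajectories below are all hypothetical. A flat trajectory is only weak evidence, since it could also mean the exploration itself has stalled.

```python
def course_corrections(estimates):
    """Absolute change in the estimate between consecutive exploration rounds."""
    return [abs(later - earlier) for earlier, later in zip(estimates, estimates[1:])]

def looks_settled(estimates, window=3, tolerance=0.5):
    """True if the last `window` corrections are all smaller than `tolerance`.

    Both parameters are arbitrary illustrative choices.
    """
    recent = course_corrections(estimates)[-window:]
    return len(recent) == window and all(c < tolerance for c in recent)

# Hypothetical trajectories of an overall value estimate over six rounds.
asymptoting = [10.0, 6.0, 5.0, 4.8, 4.9, 4.85]
spiky = [10.0, -8.0, 7.0, -9.0, 6.0, -7.0]

print(looks_settled(asymptoting))  # True
print(looks_settled(spiky))        # False
```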
What is the outside view of crucial consideration discovery? What sort of activities are people engaged in when they discover new candidate crucial considerations?
Another lens for looking at this is to say that qualitative model updates (where new distinctions are made) are drastically more important than quantitative model updates (where you change the weight or value of some existing distinction within the model). This implies that if you find yourself investing in quantitative model disputes, there is probably more value elsewhere.
I believe it is possible to push on this by encouraging model diffing, as in the recent thread on Alice and Bob’s discussion, but with an added focus on comparing the distinctions the two people are making over comparing the values/weights they assign to those distinctions. Eventually, gathering a more explicit picture of all the distinctions that different people are making could become an input into structured analytic techniques useful for finding holes, such as taxonomization. Harvesting distinctions from the existing AI literature is one potentially useful input to this.
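As a minimal sketch of what distinction-first model diffing could look like, suppose each person's model is written down as a mapping from the distinctions they make to the weight they give each one. That representation, the function name, and the placeholder distinction names are all assumptions for illustration.

```python
def diff_models(model_a, model_b):
    """Split a pair of models into qualitative and quantitative differences.

    Each model is assumed to be a dict mapping a distinction name to a weight.
    Returns (only_in_a, only_in_b, shared), where `shared` maps each common
    distinction to the pair of weights the two people assign it.
    """
    only_in_a = sorted(set(model_a) - set(model_b))
    only_in_b = sorted(set(model_b) - set(model_a))
    shared = {name: (model_a[name], model_b[name])
              for name in sorted(set(model_a) & set(model_b))}
    return only_in_a, only_in_b, shared

# Placeholder models; the distinction names stand in for real ones.
alice = {"distinction_x": 0.7, "distinction_y": 0.2}
bob = {"distinction_y": 0.6, "distinction_z": 0.4}

only_alice, only_bob, shared = diff_models(alice, bob)
print(only_alice)  # ['distinction_x']
print(only_bob)    # ['distinction_z']
print(shared)      # {'distinction_y': (0.2, 0.6)}
```

On the view above, the first two outputs (distinctions only one person makes) are where the discussion should go first, and the weight disagreement in the third is the lower-value dispute.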
I am looking for people who have had similar thoughts to discuss this with as well as further discussion of search strategies.