Modelling Vantage Points

[This is a linkpost to Modelling vantage points]

In this post I present selected passages from an article I have been writing over the last few months, which studies the optimal timing of interventions through a decision-theoretic lens.

I have put a lot of work into this article, hoping that I would become less confused about some considerations on cause prioritization and intervention timing. However, as the project grew I slowly realized that my approach was not particularly fruitful. I am publishing this (in a very unpolished form) hoping to get some feedback and spark some discussion around the topic.

ABSTRACT: We introduce the concept of vantage points: points in the future to which we can delegate decision-making. We discuss some key considerations when deciding whether we should defer to a vantage point or commit to action now: information gain, changes in option value, and value alignment. We introduce a graphical decision-making model that captures those considerations. We prove some simple properties showing that the model correctly captures some intuitions about decision-making: positivity of information gain, negativity of value misalignment, and the principle of maximization of option value. Lastly, we perform a computational study of the sensitivity of the model to trade-offs between the parameters. Based on this study we suggest some interesting conjectures about the model; in particular, we conjecture a characterization of a scenario where increased future option value disincentivizes waiting.

KEYWORDS: decision theory, normative uncertainty, option value, value alignment, probabilistic graphical models

EPISTEMIC STATUS: We think that the model points to important considerations when choosing whether to delegate to future agents, but we have no reason to believe it is comprehensive or the best way to model them. In fact, we have omitted notions of progress in bounded rationality and reversibility of decisions, which we believe to be important for decision-making. The conclusions we draw from the model do not seem particularly actionable or insightful. Nevertheless, the methodology we employed to study the question seems to be sound, and may be applicable elsewhere.

Conclusions

We have identified four key considerations for optimal intervention timing: the possibility of information gain, changes in option value, chances of value misalignment, and changes in the quality of decision-making.

We have proposed a simplified formal model that captures the first three factors and studied its formal solution. As part of our reasoning we have introduced and justified some simplifications, such as treating changes in empirical and normative uncertainty equally.

We have shown that the model has three intuitive formal properties: information gain incentivizes waiting, chances of future misalignment incentivize immediate action, and changes in option value incentivize taking the decision when the option value is higher.
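The first two properties can be illustrated with a minimal sketch. This is not the graphical model from the article; it is a hypothetical two-state, two-action toy in which waiting means a future agent decides after the state is revealed. The function names, the payoff table, and the assumption that a misaligned agent picks the action that is worst by our lights are all illustrative choices made here.

```python
def value_act_now(prior, payoff):
    """Best expected payoff when we commit before the state is revealed."""
    return max(
        sum(p * row[s] for s, p in enumerate(prior))
        for row in payoff
    )


def value_wait(prior, payoff, misalignment=0.0):
    """Expected payoff when a future agent decides after seeing the state.

    Toy assumption: with probability `misalignment` the future agent picks
    the action that is worst by our lights; otherwise it picks the best one.
    """
    total = 0.0
    for s, p in enumerate(prior):
        column = [row[s] for row in payoff]
        total += p * ((1 - misalignment) * max(column)
                      + misalignment * min(column))
    return total


# Hypothetical scenario: each action pays off in exactly one state.
prior = [0.6, 0.4]
payoff = [[1.0, 0.0],   # action 0 is right in state 0
          [0.0, 1.0]]   # action 1 is right in state 1

now = value_act_now(prior, payoff)
aligned_wait = value_wait(prior, payoff)
risky_wait = value_wait(prior, payoff, misalignment=0.5)
```

In this toy, an aligned future agent never does worse than committing now, because the expectation of a maximum is at least the maximum of the expectations. Raising `misalignment` lowers the value of waiting, and at 0.5 it falls below the value of acting immediately.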

Lastly, we have programmed a sampling tool and used it to study the trade-offs between different parameters of the model.

The results have mostly been unsurprising, although we have formulated one mildly interesting conjecture about the trade-off between the option value over one outcome and the chances of value alignment: depending on the relative value of the long-term outcome whose option value is changing, more option value in the future will incentivize either waiting or committing to immediate action.

This suggests a piece of actionable advice regarding longtermist strategy.

If you currently expect one outcome to be better than the alternative, do not expect to find much information that would change your beliefs, and expect that people in the future will have better chances of achieving that outcome, wait before committing to irreversible action.

However, if you expect the outcome to be worse than the alternative, and expect that people in the future will nonetheless be better positioned to pursue that bad outcome, you should prefer to lock yourself into the good alternative now.

GRAPH 13: Two randomly sampled scenarios. The x-axis varies a parameter that represents option value gain over a given long-term outcome, while the y-axis varies the chances of value misalignment. The red area marks parameter combinations where waiting is preferred to immediate action. In the left graph the expected relative value of the long-term outcome is positive, while in the right graph it is negative.

If we situate this in the context of downside-focused cause prioritization [2], this illustrates an interesting scenario in which more option value actually incentivizes immediate action (when there is a plausible chance of misalignment and the expected value of the future over which we are gaining control is a priori negative).
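The sign flip described above can be reproduced in a deliberately crude toy, which is not the article's model and makes several assumptions of its own: the alternative outcome has value 0 and can be locked in with certainty, `q_now` and `q_future` are hypothetical success probabilities for reaching the long-term outcome now and later, a misaligned future agent pursues whichever option is worse by our lights, and `info_bonus` is a fixed stand-in for whatever information gain waiting provides.

```python
def prefers_waiting(v, q_now, q_future, misalignment, info_bonus=0.0):
    """Return True if waiting beats acting now in this toy setup.

    v            -- relative value of the long-term outcome (alternative = 0)
    q_now        -- chance of achieving the outcome if we act now
    q_future     -- chance of achieving it if a future agent acts (option value)
    misalignment -- probability that the future agent is misaligned
    info_bonus   -- assumed fixed value of information gained by waiting
    """
    # Acting now: pursue the outcome only if it is good, else lock in 0.
    act_now = max(v * q_now, 0.0)
    # An aligned future agent does the same with the future success chance.
    aligned = max(v * q_future, 0.0)
    # A misaligned future agent pursues the outcome exactly when it is bad.
    misaligned = min(v * q_future, 0.0)
    wait = (info_bonus
            + (1 - misalignment) * aligned
            + misalignment * misaligned)
    return wait > act_now


# Good outcome (v > 0): more future option value tips the balance to waiting.
good_low = prefers_waiting(v=1.0, q_now=0.5, q_future=0.6, misalignment=0.2)
good_high = prefers_waiting(v=1.0, q_now=0.5, q_future=0.9, misalignment=0.2)

# Bad outcome (v < 0): more future option value tips it to acting now.
bad_low = prefers_waiting(v=-1.0, q_now=0.5, q_future=0.4,
                          misalignment=0.2, info_bonus=0.1)
bad_high = prefers_waiting(v=-1.0, q_now=0.5, q_future=0.9,
                           misalignment=0.2, info_bonus=0.1)
```

With these illustrative numbers, raising `q_future` turns a preference for acting into a preference for waiting when `v` is positive, and does the opposite when `v` is negative, since the extra option value is then only useful to a misaligned agent.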

Nevertheless, we want to stress that this picture is quite incomplete (fundamentally, it does not model quality of decision-making or cluelessness), and that we intended this study to be an aid to our intuitions rather than a replacement for them.

[Click here to read the rest of the article]

This article was written by Jaime Sevilla, visiting researcher at the Centre for the Study of Existential Risk. This work was supported by a grant made by the Effective Altruism Foundation.

I want to thank Eric Chen for working with me in the early stages of the project. Hjalmar Wijk and Linh Chi Nguyen helped with discussions and ideas. Max Daniel provided feedback on an early draft, and Vojta Kovařík on a later one.