Probabilities, Prioritization, and ‘Bayesian Mindset’

Sometimes we use explicit probabilities as an input into our decision-making, and take such probabilities to offer something like a literal representation of our uncertainty. This practice of assigning explicit probabilities to claims is sociologically unusual, and clearly inspired by Bayesianism as a theory of how ideally rational agents should behave. But we are not such agents, and our theories of ideal rationality don’t straightforwardly entail that we should be assigning explicit probabilities to claims. This leaves the following question open:

When, in practice, should the use of explicit probabilities inform our actions?

Holden sketches one proto-approach to this question, under the name Bayesian Mindset. ‘Bayesian Mindset’ describes an approach, common in longtermist EA (LEA), to quantifying uncertainty and to using the resulting quantification to make decisions in a way that’s close to Expected Value Maximization. Holden gestures at an ‘EA cluster’ of principles related to thought and action, and discusses its costs and benefits. His post contains much that I agree with:

  • We both agree that Bayesian Mindset is undervalued by the rest of society, and shows promise as a way to clarify important disagreements.

  • We both agree that there’s “a large gulf” between the theoretical underpinnings of Bayesian epistemology, and the practices prescribed by Bayesian Mindset.

  • We both agree on a holistic conception of what Bayesian Mindset is — “an interesting experiment in gaining certain benefits [rather than] the correct way to make decisions.”

However, my sentiments part with Holden’s on certain issues, so I use his post as a springboard for this essay. Here’s the roadmap:

  • In (§1), I introduce my question, and outline two cases in which it appears (to varying degrees) helpful to assign explicit probabilities to guide decision-making.

  • I discuss complications with evaluating how best to approximate the theoretical Bayesian ideal in practice (§2).

  • With the earlier sections in mind, I discuss two potential implications for cause prioritization (§3).

  • I elaborate one potential downside of a community culture that emphasizes the use of explicit subjective probabilities (§4).

  • I conclude in (§5).

1. Philosophy and Practice

First, I want to look at the relationship between longtermist theory, and practical longtermist prioritization. Some terminology: I’ll sometimes speak of ‘longtermist grantmaking’ to refer to grants directed towards areas like biorisk and AI risk. This terminology is imperfect, but nevertheless gestures at the sociological cluster with which I’m concerned.[1]

Very few of us are explicitly calculating the expected value of our career decisions, donations, and grants. That said, our decisions are clearly informed by a background sense of ‘our’ explicit probabilities and explicit utilities. In The Case for Strong Longtermism, Greaves and MacAskill defend deontic (action-guiding) strong longtermism as a theory which applies to “the most important decision situations facing agents today”, and support this thesis with reference to an estimate of the expected number of lives that can be saved via longtermist interventions. Greaves and MacAskill note that their analysis takes for granted a “subjective decision theory”, which assumes, for the purposes of an agent deciding which actions are best, that the agent “is in a position to grasp the states, acts and consequences that are involved in modeling her decision”, and then decides what to do, in large part, based on her explicit understanding of “the states, acts and consequences”.

Of course, this is a philosophy paper, rather than a document on how to do grantmaking or policy in practice. The examples are going to be stylized. But I take it that, by stating a deontic thesis, Greaves and MacAskill are attempting to offer a decision problem which shares enough structural similarity with our actual decisions to usefully inform practice.

In practice, explicit EV calculations are often not the central feature of grantmaking. Nick Beckstead notes that BOTECs were used for a “minority” of Future Fund grants, although there are “standard numbers” he references “in the background of many applications”, like estimates for the probability and tractability of various x-risks. 80,000 Hours explicitly notes its attempt to find “proxies” for expected value. Open Philanthropy prioritizes via worldview diversification, rather than explicit EV maximization, though it cites (as one justification for worldview diversification) the convergence between worldview diversification and EV maximization under certain assumptions.

I raise these examples with the hope of painting a familiar picture: a picture where explicit probabilities play some role in determining LEA strategy, and play some role undergirding why we feel justified in doing what we do. With this in mind, I now want to look at what sort of story we may use to justify the use of explicit probabilities in practice.

1.1.

Suppose we’re playing poker, and I’m trying to understand the likely cards in the hand of my opponent. In this case, it looks as if I may do better by attempting to explicitly calculate the probability that my opponent has a given hand, based on the information I’ve obtained from previous rounds. I endorse this strategy because I have a clear understanding of the game we’re playing, the distribution of card types within a deck, and enough theory of mind to adequately reason about the likely behaviors of various players when they’re faced with given hands. If I don’t have these components, I start to lose faith in the value of assigning explicit probabilities.
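To make this concrete, here’s a minimal sketch (in Python, with made-up pot and bet sizes, and a deliberately simplified win condition) of the kind of explicit calculation available once the state space is fixed: I know the composition of the deck, I can count the unseen cards, and I can compare the expected value of calling against folding.

```python
# Toy illustration with made-up numbers: once the state space is fixed (a standard
# 52-card deck, known rules), explicit probabilities and expected values are easy
# to compute.

def flush_draw_ev(pot: float, bet_to_call: float) -> float:
    """Expected value of calling while holding a four-card flush draw with one
    card to come, under the simplifying assumption that we win the pot exactly
    when the flush completes."""
    visible_cards = 6                        # 2 hole cards + 4 community cards
    visible_hearts = 4                       # the four hearts making up the draw
    unseen_cards = 52 - visible_cards        # 46 cards we haven't seen
    remaining_hearts = 13 - visible_hearts   # 9 hearts left among the unseen cards
    p_win = remaining_hearts / unseen_cards  # 9/46 ~= 0.196
    return p_win * (pot + bet_to_call) - (1 - p_win) * bet_to_call

if __name__ == "__main__":
    ev_call = flush_draw_ev(pot=100.0, bet_to_call=20.0)
    print(f"EV(call) = {ev_call:.2f} vs EV(fold) = 0.00")
    # EV(call) comes out positive, so the explicit calculation says to call.
```

The particular numbers don’t matter; the point is that every step of this calculation presupposes that I already know which game I’m playing.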

To see this, imagine instead that I’m at the poker table, but this time I don’t know the rules of poker, nor do I know the composition of a standard deck of cards, or even the concept of a ‘card game’. Here’s one thing I could do: start by trying to construct explicit credences over my current, highly confused model of the state space, and choose the highest expected value action, based on my current understanding of what’s going on.

I wouldn’t do this, and probably nor would you. In practice, I’d be much more tempted to focus my efforts on first constructing the relevant state space, over which I may then start to assign more explicit probabilities. I’d ask various questions about what we’re even doing, and what the rules of the game are. With more detailed information in mind, I may then be able to understand what sort of state space I’m happy to treat as fixed, before I get to the task of assigning probabilities.

1.2.

So, what’s going on here?

If we retain the probabilistic language, we can describe two possible actions I could take:

  1. Treat my state space as fixed, and assign probabilities over that space.

  2. Construct a state space which better represents the structure of the target domain, and assign explicit probabilities only once I’m confident that my model of the world is accurate enough for probability assignments over that space to helpfully guide action.

Even if there’s a Bayes-optimal reconstruction of Option 2, explicit Bayesian behavior is very far from what I’m consciously doing, and diverges from Holden’s account of Bayesian Mindset. In the ‘confused poker’ example, it looks more like I’m getting my mind in enough order for the demands of Bayesianism to even be applicable. Bayesianism doesn’t apply to rocks, for rocks just don’t have the requisite structure for Bayesian demands to meaningfully apply. Prior to knowing anything about poker, I (plausibly) shouldn’t try to assign explicit probabilities, and then take the highest expected value actions. Instead, I should more plausibly engage in creating the setup required for explicit probability assignments to be useful for my purposes.

Philosophical arguments for probabilism are theories about the ideal behavior of agents who know, with certainty, that their state space captures every way the world could possibly be, and face the sole task of locating themselves among the coherent worlds. While this statement about Bayesianism isn’t conceptually groundbreaking, I do think it shows that we need to look for more explicit arguments for assigning explicit probabilities in practice, as we don’t get straightforward justifications for our practices directly from arguments about the behavior of ideal agents.

2. Okay, But Shouldn’t We Try to Approximate the Bayesian Ideal?

Maybe. I just think it’s deeply unclear what approximating the ideal actually looks like.

More specifically, I take it that one argument for assigning probabilities runs as follows: Bayesianism (let’s suppose) tells us how ideally rational agents should behave. We are not such agents. But we’d like to be as close as possible to such agents, and so we should try to mimic the behavior of ideal Bayesians, computational constraints permitting.

Even if the argument presented above is sound, I think its downstream practical upshots are radically unclear. I don’t think approximation arguments deliver obvious implications for how we should actually behave, or for whether we – in practical decision-making – should be assigning explicit probabilities.

2.1.

Richard Ngo presents a hypothetical dialogue between Alice and Bob. We can imagine that Bob has persuaded Alice that she ought to approximate the Bayesian ideal, and she is now diligently trying to work out what that involves. I’ve lightly edited some components for brevity:

A: … Should I think about as many such skeptical hypotheses as I can, to be more like an ideal bayesian who considers every hypothesis?

B: Well, technically ideal bayesians consider every hypothesis, but only because they have infinite compute! In practice you shouldn’t bother with many far-fetched hypotheses, because that’s a waste of your limited time.

A: But what if I have some evidence towards that hypothesis? For example, I just randomly thought of the hypothesis that the universe has exactly googolplex atoms in it. But there’s some chance that this thought was planted in my mind by a higher power to allow me to figure out the truth! I should update on that, right?

B: Look, in practice that type of evidence is not worth keeping track of. You need to use common sense to figure out when to actually make the effort of updating.

A: Hmm, alright. But when it comes to the hypotheses I do consider, they should each be an explicit description of the entire universe, right, like an ideal bayesian’s hypotheses?

B: No, that’s way too hard for a human to do.

While Alice’s responses may appear silly, they raise an important point. Even if we’re persuaded that we should attempt to approximate the Bayesian ideal, we (in practice) bring in all sorts of auxiliary assumptions about what good approximation looks like, and these assumptions often move us away from behavior which appears to more superficially mimic the behavior of the Ideal Bayesian.

We can generalize this idea: the fact that ideal agents behave in some way B does not mean that we, who face a different set of constraints, ought to do something which looks like approximating behavior B. Richard’s dialogue provides an intuition pump in support of this conclusion. We can also support this line of argument with reference to some economic theory: we know from the theory of second best that if one optimality condition in a model (and practical applications of Bayesian reasoning are models, after all) cannot be satisfied, then the next-best solution may involve changing the values of other parameters away from values that would be optimal in the maximally ideal case.

In practice, this means that justifications for explicitly attempting to meet the criteria laid down by formal ideals will have to rely on some story that provides a reason for believing that marginal efforts towards mirroring these ideals actually get us closer to ideal reasoning, given all of our other constraints.

2.2.

All that said, we have to act somehow. And we might think that, for most purposes, we don’t need a complicated story explaining why marginal shifts towards probabilistic language brings us closer to the ideal. Putting to one side more arcane theoretical discussions, we may claim that we obviously want to act in ways that remain sensitive to both stakes and likelihood.

Moreover, we know that explicit probabilities and EV-type decision procedures have their uses, and we may further feel that both common-sense epistemology and common-sense decision theory are confused in myriad ways. With this in mind, we might think it makes sense to treat Bayesian Mindset as something like the null hypothesis for good LEA decision-making. Indeed, this may have played a role in motivating a prior defense of practical EV-type reasoning, offered by Arden:

I think for me at least, and I’d guess for other people, the thing that makes the explicit subjective credences worth using is that since we have to make prioritisation decisions/decisions about how to act anyway, and we’re going to make them using some kind of fuzzy approximated expected value reasoning, making our probabilities explicit should improve our reasoning.

Arden goes on to note that, by using explicit probabilities, we put “rational pressure” on our thinking in a way that’s often productive. If we’re to interpret the strength of Arden’s argument as a justification for our standard use of explicit probabilities, we’ll need a sense of what it means to say that we’re actually making decisions “using some kind of fuzzy approximated expected value reasoning”.

Let’s go back to the example where I’m sitting at the table, unaware of the rules of poker. Maybe, when I’m asking questions like “what’s the deal with this ‘deck of cards’?” and “what are we even doing here?”, I’m — ultimately, at some level — doing something like expected value reasoning. However, as I’m primarily interested in the use of explicit probabilities as forming one component of a practical decision procedure, I’ll assume (pretty reasonably, I think) that activities like ‘asking questions to learn the rules of the game’ fail to count as using “fuzzy approximated expected value reasoning”.

With that backdrop in mind, I interpret Arden as offering a justification for the use of explicit probabilities — relative to the LEA worldview. That is, I interpret Arden’s argument as justifying the use of explicit probabilities, so long as we assume that we’ve already got the shape of relevant considerations for affecting the long-term future in clear view. It assumes, to use a metaphor, that we’re willing to grant we’re actually playing poker.

For instance, if you’re unsympathetic to speculative-sounding stuff to begin with, or bored by all matters unrelated to the Anti-Capitalist Revolution, then Arden’s argument won’t (and, relative to your internal standards, shouldn’t) convince you to start assigning explicit probabilities over your extant picture of the world. Perhaps you think – given the large degree of uncertainty we have about the future – that people with LEAish values should be doing state space construction. Or perhaps you think, more generally, that the basic worldview which highlights the salient propositions over which LEAs assign probabilities is on the wrong track. I like Arden’s argument. I just think that, properly interpreted, it cannot serve as a neutral ground allowing us to adjudicate on the comparative benefits of using explicit probabilities compared to alternatives.

2.3.

So far, I’ve claimed that even if we ought to approximate the Bayesian ideal, it’s unclear what practical conclusions follow from this claim. Additionally, I’ve claimed that attempts to justify the employment of fuzzy EV-ish reasoning must be supported by an independent argument — an argument for believing that the practical state space over which we form probabilities is independently justified. In other words, I think that the practical adoption of explicit probabilities is justified by the assumption that our state space is (in some sense) accurate enough. Some creatures would be better served by mapping out the dynamic dependencies of the world, before they would benefit from assigning explicit probabilities. If we feel justified in using explicit probabilities to guide action, we must believe that we are not these creatures.

For ideally rational agents, there exists no tradeoff between the task of state space construction, and the assignment of explicit probabilities over that space. However, we — imperfect agents that we are — do face this tradeoff. In the next section, I’ll discuss how one might navigate this tradeoff in practice.

3. Mechanisms, Metaculus, and World-Models

Here’s what I’ve said so far:

  1. Claiming that the assignment of explicit probabilities is, in practice, the best way to approximate the Bayesian/​EV ideal is justified to the extent that we already possess world-models which are, in some sense, accurate enough.

In this section, I’ll attempt to resolve some of the vagueness inherent in my remarks, and suggest some more concrete implications. First, I’ll put forward one justification for the adoption of explicit probabilities in practice. Then, I’ll try to motivate the case for work which enables us to better understand more foundational worldview disagreements — specifically, I’ll suggest the need for more object-level, mechanistic models of AI trajectories, and more forecasting generalizability research.

3.1.

There are possible worlds under which treating these explicit probabilities as the inputs into explicit EV-ish calculations is a good decision procedure, and possible worlds under which this is a bad procedure, and we should instead work to refine our state space before assigning explicit probabilities. How do we tell which one we’re in?

Suppose we had the following evidence, in a world quite like our own:

  1. Forecasters were well-calibrated on near-term questions, and

  2. We had an understanding of the cognitive mechanisms which caused such forecasters to be well-calibrated, because such cognitive mechanisms were reliably correlated with the underlying dynamics driving near-term events, and

  3. We had reason to believe that the dynamics underlying near-term events share structural similarities with the dynamics underlying far-out events.

In this world, arguments for deferring to forecasts on near-term events more straightforwardly carry over to forecasts on farther-out events. We’d have an output (a subjective probability estimate) together with an understanding of the mechanism which generated that output, and so a principled explanation of why we should trust subjective probability estimates. We’d have two distinct mechanistic stories (one about forecasters’ psychology, one about the dynamics of the target domain), and a justification for treating the output of the forecaster as informative about the behavior of systems in the target domain. We’d have a correlation between two variables, and an argument for believing in the robustness of that correlation. In this world, the case for deferring to the long-term forecasts of well-calibrated near-term forecasters would be similar in kind to the case for believing that rain precedes wet sidewalks.

I believe that rain will continue to precede wet sidewalks. I believe this not only due to the (really quite strong) correlational evidence, but also due to an understanding of the mechanism by which rain causes wet sidewalks — even if this mechanism is no more sophisticated than “rain is wet → wet things make certain kinds of uncovered dry things more wet”.

Without a comparable mechanism (and, let’s face it, it’s not rocket science) outlining why we should expect forecasts of long-term events to actually capture some relevant structure of the forecasted target domain, I don’t see why we should treat subjective probability estimates for far-out events as useful inputs into practical calculations. After all, maximizing expected value does better in the long run only when your subjective probabilities are (or at least converge to) well-calibrated estimates of the likelihood of events within your complete state space. Without independent arguments for the calibration of our subjective probabilities, I don’t see the justification for believing that we will do better by taking probabilistic forecasts as providing the subjective probabilities we should use to guide action. And, after all, the case for using probabilities in practice relies on this practice actually helping us do better.
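A toy simulation can make this vivid (purely illustrative assumptions on my part: a made-up betting game, a crude `overconfidence` parameter, and an agent that maximizes expected value by its own subjective lights):

```python
import random

# Illustrative toy model: an agent repeatedly decides whether to take a bet that
# pays +1 with true probability p and -1 otherwise. The agent accepts whenever
# its *believed* probability of winning exceeds 0.5 (the subjective-EV-maximizing
# rule). Calibrated beliefs track p; miscalibrated beliefs are inflated.

def run(n_rounds: int, overconfidence: float, seed: int = 0) -> float:
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rounds):
        p_true = rng.random()                           # true chance the bet pays off
        p_believed = min(1.0, p_true + overconfidence)  # inflated subjective probability
        if p_believed > 0.5:                            # accept iff subjective EV > 0
            total += 1.0 if rng.random() < p_true else -1.0
    return total

if __name__ == "__main__":
    print("calibrated agent:   ", run(100_000, overconfidence=0.0))
    print("overconfident agent:", run(100_000, overconfidence=0.3))
```

In this toy setup, the overconfident agent accepts many bets with negative true expected value and ends up markedly worse off than the calibrated agent, even though both are ‘maximizing expected value’ by their own lights.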

3.2.

Putting this into practice: I’m skeptical about most probabilistic forecasts for AGI timelines, especially when such forecasts are referenced without recourse to a more specific object-level model.

Of course, there are attempts to more explicitly model the timing or consequences of AGI, and many forecasters are aware of such models. But I remain skeptical of such models, in short, because I don’t see an independent justification for believing that the forecasts stem from models of the world detailed enough to justify treating the resulting subjective probabilities as meaningful.

Putting forward a sentiment like “more discussion of object-level models, fewer probabilities!” feels hard to rigorously justify, especially without reliance on somewhat idiosyncratic judgment calls. In summary, though: I mostly start from a place of skepticism – prior to examining the moving parts within these models – about the ability of explicit models of AI timelines to capture enough real-world structure for the derived estimates from such frameworks to deliver useful information. And, by and large, I don’t find myself convinced that the methods involved would retrodict past AI milestones, nor do I find myself convinced by object-level arguments for the methodological adequacy of the tools at hand to generate reliable forecasts.

My inner voice doesn’t respond to (e.g.) Cotra’s report by referencing skeptical priors about living in the most important century, but rather feels doubts about the underlying vision of the world in which the model functions — a vision wherein we act more judiciously, and develop better understandings of the world, by investing in forecasting models composed of many speculative subjective probability estimates. My inner voice is skeptical about the degree to which forecasts based on Cotra’s methodology (or similar) capture the important dynamics that would allow us to track the end results we care about, and feels nebulously uneasy about the way certain forecasts are informally discussed and presented. I feel uneasy with claiming there’s a “good case” for a >50% chance of transformative AI arriving within our lifetime, simply via “project[ing] forward available computing power, or [using] expert forecasts”. The track record of futurism isn’t great, and (despite the success of superforecasters) our evidence on forecasting calibration over longer time horizons is limited — at the very least, earlier research by Tetlock highlighted the limited usefulness of expert forecasts from political scientists and economists.

When forecasting the arrival and dynamics of AGI, my first wish is to be presented with a default story of the future, and I want to see arguments for expecting our future to contain mechanisms present in such stories. When we assign subjective probabilities, these probabilities are always probabilities relative to some set of salient hypotheses. If we disagree on the construction of the hypothesis space, this disagreement is more foundational than disagreements about probability assignments. I think that many existing disagreements about AI timelines arise from fairly foundational differences about background worldviews and mechanisms which form the set of relevant hypotheses over which probabilities should be assigned, and fairly foundational disagreements over the appropriate epistemic strategies to enact in order to make progress on predicting the future.

According to Holden, the hope behind Bayesian Mindset is hope for more transparent disagreements, allowing us to “communicate clearly, listen open-mindedly, [and] learn from each other”. However, I think it’s plausible that centering probabilistic models in AI forecasting impedes our ability to “communicate clearly” and “learn from each other”. I think that operating with Bayesian Mindset can lead us to easily mischaracterize disagreements as disagreements about probabilistic estimates, rather than disagreements about the background framework defining the set of hypotheses over which we assign probabilities. Some examples:

  • While most people don’t have Yudkowskyan views on AI timelines, I think that many people outside of LEA would share his skepticism about the Biological Anchors report, if they’d devoted time to reading it.

    • Yudkowsky notes that the Biological Anchors model “contains supposedly known parameters”, like the “amount of computation an AGI must eat per second, and how many parameters must be in the trainable model for that, and how many examples are needed to train those parameters”.

    • According to Yudkowsky – even if the relevant biological assumptions of the report are correct – “the assumption that future people can put in X amount of compute and get out Y result is not something you really know.”

    • For this reason, Yudkowsky states that the framework offered by Cotra “is a trick that never works and, on its own terms, would tell us very little about AGI arrival times at all.”

  • Carlsmith forecasts the chance of “x-risk from AI by 2070”, rather than providing an independent timelines forecast. [2]

    • Carlsmith’s report involves assigning subjective probabilities to a variety of fuzzy, hard-to-formalize, and speculative claims. This sort of methodology is similarly unusual outside of LEA, and doesn’t seem to boast an obviously strong track record.

    • Moreover, I think that various reviewer disagreements are well-described as disagreements over the basic default story of the world that Carlsmith’s discussion operates within.

      • Reviewers note disagreements about the “framing of the probability steps”, worries about the ‘default’ dynamics in the report feeling either “wrong or not-obviously-meaningful”, and disputes about the broader conceptual framing of the report. This inclines me to believe that the primary source of disagreement is a set of more inchoate conflicts between background worldviews, rather than disagreements stated cleanly enough to be modeled as different probability assignments within a mutual world-model.

    • One more general objection to Carlsmith’s form of modeling might cite a background distrust of chains of reasoning with such imperfect concepts. When our concepts are fuzzy, we need to precisify our basic language to allow for statements precise enough to know what we’re assigning probabilities over, and what we’re holding fixed.

  • MIRI’s vision of the future is inspired by an alternative (less formally developed) model, but this too (to the extent I understand it) is fairly unusual, relying on various law-like properties related to agency and rationality. The structural dynamics behind this story are dynamics which (as acknowledged by Yudkowsky) are hard for him to “Actually Explain” in a way that sticks.[3]

Imagine that I’ve encountered an open-minded, good faith religious mystic from the 18th Century. I wouldn’t attempt to orient the mystic towards Bayesian Mindset. Instead, I’d want to suggest an alternative frame for the mystic. I’d try to offer my default, mainline picture of the world as the one they themselves are living in, and attempt to make the case for a background picture which doesn’t include God. When faced with disagreements about background worldviews, attempting to frame disagreements as disagreements within your default world-model can impede understanding. In a similar vein, I believe that effort invested into disputing various parameter estimates included in AI forecasting models (and even probabilistic weightings over different models) would be better directed towards explicating the underlying proto-arguments required to justify these models, and towards more foundational defenses and criticisms of such models.[4]

Admittedly, more basic investigations into the underlying justificatory status of these formal models require a much larger time-investment than simply providing one’s own explicit probabilities. This is unfortunate. But I think that it’s only with reference to such background world-models that we deliver useful information by offering subjective probabilities. Even if we’re in the business of deferring, I’d much rather know which world-models (rather than which distributions over timelines) are being deferred to.[5] When we defer to world-models, I think we at least have a better sense of who (or what methodology) is being trusted, and to what degree.

3.3.

Perhaps you’re convinced of longtermism, but skeptical about AGI. And perhaps you’re interested in less targeted (more ‘broad’) research to improve the long-term future. In this case, I think my discussion has a practical upshot for you: you should think that research on forecasting generalizability is neglected.

I’ve mentioned one arena in which I’m skeptical of forecasts. And, while I don’t think we should be deferring to AI timelines forecasts from Metaculus, I am happy to assign explicit probabilities to various near-term geopolitical events (e.g., wars between countries), for which I’d happily defer to aggregate predictions from forecasting platforms.

In geopolitical cases, we possess reasons to believe that our internal model of events within a given reference class incorporates mechanisms likely to emerge in various future political scenarios. The mechanisms which drove previous political leaders to war are likely to recur (to varying degrees) in future political conflicts. We have, additionally, domain-specific evidence on the calibration of certain forecasters, which suggests that certain forecasters are able to (imperfectly) track such mechanisms. In these cases, the probabilities offered by forecasters pass muster as useful probabilistic information, as we have evidence that the probabilistic forecasts arose as the result of a generating process which we expect to generalize for ordinary inductive reasons.

That said, I find it unclear just how far we should expect the calibration of superforecasters (and others) to generalize. While aggregate forecasts from such platforms have had reasonable success over one-year time horizons (maybe longer; I’m interested in further evidence),[6] I think we need better mechanistic explanations of what’s going on with superforecasters, of a kind that explains why they’re well-calibrated. Without a deeper mechanistic understanding of forecasting skill, I think we lack independent justification for allowing estimates from aggregate forecasting platforms to significantly shape practical prioritization decisions — at least beyond the narrow domains for which we have evidence about the calibration rates of forecasters.
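For concreteness, the track-record evidence I have in mind comes from checks roughly like the following (a minimal sketch; the `records` format and the sample data are hypothetical): bin resolved forecasts by stated probability, then compare each bin’s average forecast with the observed frequency of the event.

```python
from collections import defaultdict

def calibration_table(records, n_bins: int = 10):
    """records: iterable of (stated_probability, event_occurred) pairs for
    questions that have already resolved."""
    bins = defaultdict(list)
    for prob, occurred in records:
        bins[min(int(prob * n_bins), n_bins - 1)].append((prob, occurred))
    table = []
    for idx in sorted(bins):
        entries = bins[idx]
        mean_forecast = sum(p for p, _ in entries) / len(entries)
        observed_rate = sum(o for _, o in entries) / len(entries)
        table.append((mean_forecast, observed_rate, len(entries)))
    return table

if __name__ == "__main__":
    # Hypothetical resolved forecasts, purely for illustration.
    sample = [(0.9, True), (0.85, True), (0.8, False),
              (0.2, False), (0.15, True), (0.1, False)]
    for mean_forecast, observed_rate, n in calibration_table(sample, n_bins=5):
        print(f"forecast ~{mean_forecast:.2f} -> observed {observed_rate:.2f} (n={n})")
```

A check like this is only possible once questions resolve, which is part of why calibration evidence over multi-decade horizons is so thin.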

In other words: without a more convincing mechanistic explanation of why we should expect aggregate forecasts to reliably track real-world structure, I don’t think we have independent justification for treating such forecasts as meaningful. We may hope that forecasting skill generalizes to longer time horizons and more novel domains, but I think that it’s just that — a hope. Indeed, the probabilities offered by well-calibrated (aggregate communities of) forecasters represent the cases in which the justification for using explicit probabilities is strongest. Most of us have no such evidence about the calibration of our own forecasts, and so the justification for our use of explicit probabilities, lacking support from independent track records, rests on even shakier foundations.

I expect that many people, like me, will be excited by the potential of forecasting generalizability research to offer more useful tools to policy-makers concerned with the long-term future. I think that there’s an additional advantage to such research. On my view, research on forecasting generalizability is not just research which will be useful to policy-makers — it’s also research which can help us evaluate the domains in which Bayesian Mindset provides a useful tool. It’s research on the practical strengths and weaknesses of Bayesian Mindset.[7]

3.4.

So far, I’ve discussed how we might justify LEA’s use of explicit probabilities. In light of these remarks, I’ve outlined two potential implications for community-level research prioritization. We now move on to the penultimate section, where I’ll offer a (normatively laden) sociological diagnosis of LEA’s epistemic proclivities. In §4, I’ll claim that the widespread community adoption of probabilistic language has, in certain ways, obscured our epistemics.

4. Epistemic Gamification

Like Holden, I believe that:

“Bayesian mindset is an early-stage idea that needs a lot of kinks worked out if it’s ever going to become a practical, useful improvement for large numbers of people making decisions [when compared to what they would be doing otherwise]”

However, Holden describes himself as “an enthusiastic advocate for the Bayesian mindset”, whereas I feel more uneasy about describing myself that way. I feel much more comfortable describing myself as an advocate for marginal shifts towards Bayesian Mindset in the wider world, and an advocate for marginal shifts away from ‘Bayesian Mindset’ in LEA.

I sense that, compared to Holden, I see more substantial downsides to the use of Bayesian Mindset within LEA. I’ve suggested one: I believe that Bayesian Mindset sometimes directs our attention too quickly towards assigning explicit probabilities, and away from developing better models of the world.

4.1.

Thi Nguyen has a concept I quite like, called gamification. It might be helpful to think of it as a close cousin of Goodharting.

Nguyen introduces his concept by reflecting on games — in the sense of baseball, monopoly, catch, chess, or soccer, rather than the formal models of game theory. According to Nguyen, games are a special breed of activity in which you’re told, for a small, isolated period of time, what your values are. Games impose a local, externally given reward structure, in which you are temporarily told to care only about scoring points, or capturing the Bishop, or whatever else. Games are satisfying because the world is complicated and they offer the allure of a legible, externally given reward structure. The rules of the game become — just for one small moment — all of what you value.

Gamification is a social mechanism, piggybacking on Nguyen’s broader take on games. Social processes gamify our values when legible metrics swoop in and alter our values in a way that’s deleterious to our broader set of ends. That is, social processes gamify participants’ values when they behaviorally transform participants into people who act as if the proxy were what they ultimately value. Gamification is more like Goodharting’s close cousin than its twin, since Nguyen’s picture doesn’t explicitly refer to optimization. Instead, Nguyen’s account is fuzzier: we don’t necessarily optimize for some proxy of our values, but rather become confused about (or lose touch with) what we fundamentally value.

To illustrate: imagine you hit the gym with the aim of becoming healthier. Initially, ‘reducing body fat percentage’ seems like a reasonable proxy for this goal. Then, you end up pursuing the proxy beyond the point where meeting it is actually useful. Nguyen lists other examples too, but they’re immaterial here. For our purposes, I want to claim that the widespread practice of offering explicit subjective probabilities has, for many of us, gamified our epistemic values.

My ‘epistemic gamification’ claim rests on the assumption that the use of explicit probabilities makes us too quick to misdiagnose the character of our uncertainty, in a way that makes us worse at understanding the world. The centrality of probabilistic language can make it seem as if our primary sources of uncertainty were about the appropriate probabilities to assign to claims we explicitly consider, rather than about which questions are the relevant ones to ask. In earlier sections, I hinted at that claim. I suggested that the use of probabilities has (in our new terminology) gamified our epistemic values, by obscuring the implicit world-model that’s required for our probability assignments to mean anything at all. I suggested, further, that epistemic gamification leads us to mischaracterize disagreement on important questions, in a way that makes us worse at furthering our epistemic aims.

Without close attention to the role that explicit probabilities are meant to serve when they’re provided, I think that we will continue to be subject to epistemic gamification. I end with one final example.

4.2.

I saw a lot of LEA pushback in response to the following comment from Tyler Cowen:

“Hardly anyone associated with [The] Future Fund saw the existential risk to … [The] Future Fund, even though they were as close to it as one could possibly be … I am thus skeptical about their ability to predict existential risk more generally, and for systems that are far more complex and also far more distant.”

Examples of LEA pushback on this remark can be seen here, here, and here. Critical reception to a similar point can also be seen here. I interpret the linked responses as suggesting that a failure to foresee the risks of FTX’s collapse is basically uncorrelated with a failure to accurately model existential risks more broadly. When LEAs object to Cowen’s remark, I assume they object for similar reasons to Katja Grace: objectors are likely to think that the concept of “managing tail risks … seems like too broad a category” to be a useful object of analysis.

The more I’ve reflected on Cowen’s comment, the more I’ve thought that Cowen was onto something. In brief, I take Cowen to be claiming that FTX’s collapse provides evidence about the failure of the LEA movement writ large to focus on the right hypothesis space, which in turn provides reason to doubt that the model-based estimates of existential risks are operating within the right hypothesis space. If you’re part of a movement with a very general concern for doing the most good, you have to find some way to narrow your space of legitimate hypotheses. When evidence pertaining to the inadequacy of your state space for predicting the future arises, this is at least some evidence against the adequacy of whatever mechanism you use, in practice, to choose your practical hypothesis space. It suggests that you’re liable to miss important considerations.

4.3.

Here’s a parodic rebuttal from Scott Alexander:

“Leading UN climatologist Dr. John Scholtz is in serious condition after being wounded in the mass shooting at Smithfield Park. Scholtz claims that his models can predict the temperature of the Earth from now until 2200 - but he couldn’t even predict a mass shooting in his own neighborhood. Why should we trust climatologists to protect us from some future catastrophe, when they can’t even protect themselves in the present?”

According to me, this parodic rebuttal doesn’t impugn Cowen’s point, because LEA (unlike the climatologist) wishes to make very general, all-things-considered judgments about relative risks from many diverse cause areas, without reference (as in the case of climatology) to similarly detailed mechanistic models pertaining to future risks. The climatologist has explicitly chosen a narrow domain of application, in which (if they’re similar to most actual climatologists) they reason in a distinctive way. Consequently, the evidence from the shooting provides little information about the degree to which we should trust the individual climatologist’s more narrow mode of reasoning. If, instead, the climatologist operated with an unusual (and largely untested) practical epistemology — and this practical epistemology was applied with more global scope — then I think we would have a much stronger case for questioning the climatologist’s more novel mode of reasoning.

I take (the best version of) Cowen to be claiming that there’s a distinct skill of shrewd judgment, which plays a role in forming the space of salient options to be considered. If the LEA community (by and large) failed to consider the downfall of FTX as a salient option to consider — despite the incentives to do so — this constitutes one reason to believe that the current hypothesis space with which LEA operates is lacking certain crucial details. It suggests that our (formal and informal) epistemic institutions were insufficiently attentive to considerations outside LEA’s default frame. It doesn’t necessarily suggest a failure in the estimates of x-risk for any specific cause area, but I think it does suggest some (potentially correctable) failing in forming the optimal space of hypotheses to explicitly consider.[8]

I think one component of ‘shrewd judgment’ involves noticing when your background framework is ill-posed, and consequently using this information to rethink your model. This skill is distinct from something more akin to ‘standard Bayesian updating within your current world-model’. It’s less about locating ourselves among the coherent worlds, and more about noticing subtle incoherencies lurking in the initial setup of our problem. Nate Soares puts a similar point nicely:

If you have 15 bits of evidence that Mars is in the east, and 14 bits of evidence that Mars is in the west, you shouldn’t be like, “Hmm, so that’s one net bit of evidence for it being in the east” and call it a day. You should be like, “I wonder if Mars moves around?”

“Or if there are multiple Marses?” “Or if I’m moving around without knowing it?”

Failing that, at least notice that you’re confused and that you don’t have a single coherent model that accounts for all the evidence.
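To spell out the arithmetic Soares is gesturing at (my reconstruction, assuming even prior odds and reading ‘bits’ as log-2 likelihood ratios): netting the evidence mechanically leaves you about two-thirds confident that Mars is in the east, which is exactly the kind of tidy answer that should instead prompt suspicion about the setup.

```latex
% Assuming even prior odds, and treating "bits of evidence" as log-2 likelihood ratios:
\log_2 \mathrm{Odds}(\text{east}) = 0 + 15 - 14 = 1
\quad\Longrightarrow\quad
\mathrm{Odds}(\text{east}) = 2 : 1
\quad\Longrightarrow\quad
P(\text{east}) = \tfrac{2}{3} \approx 0.67
```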

So, how much should we update on Cowen’s remarks, and how much does this relate to our use of explicit probabilities? This feels hard to answer with a great degree of quantitative rigor. Certainly, other communities (which would, no doubt, fare worse on all sorts of other epistemic dimensions) would have been more wary of high-profile individuals who acquired wealth from crypto as quickly as SBF did. All I can say, overall, is that it seems plausible to me that LEA was operating with a set of gamified epistemic values, in a way that made us worse at searching for hypotheses outside our practical world-model.

4.4.

I don’t want to lose sight of the value offered by thinking probabilistically, or using EV-ish decision procedures. Of course, we want to prioritize according to stakes and likelihood. In many cases, that will require getting quantitative, and doing something like EV-maximization in practice. I just think it’s important to engage in such practices self-consciously, avoiding the temptation to reflexively default to explicitly probabilistic language before we have an agreed upon sense of what we’re holding fixed.

I do not wish to forego all use of probabilistic language. I wish wider society were more probabilistic, and attempts to gain a better handle on our uncertainty are laudable. Instead, I wish to provide a better conception of what we’re trying to do when using probabilistic language, alongside a first-pass, exploratory investigation into the pragmatic costs and benefits of using explicit probabilities. There are many benefits to the use of explicit probabilities. But, plausibly, there are also downsides to the widespread community adoption of explicit probabilities. One such downside, I think, can be expressed as follows: the widespread adoption of explicit probabilities has gamified LEA’s epistemic values.

5. Conclusion

I’ve discussed the relationship of Bayesian Mindset to the underlying formal frameworks which motivate the practice of Bayesian Mindset. I’ve also claimed that a more explicit understanding of the relationship between theory and practice has implications for prioritization — we should engage in more focused research directed towards our underlying world-models, and more research specifically focused on testing the limits of our implicit background frameworks.

I’ll end with one final claim. I think, with my pragmatic view of probabilities in mind, we possess a story of why we’re ultimately justified in not treating the outputs of more speculative subjective EV estimates literally, without recourse to metaphors like ‘stepping off the Crazy Train’. In my terminology, we step off the Crazy Train when we no longer trust that the probabilities we assign meet the necessary background conditions required to justify the usefulness of probability assignments. We step off the Crazy Train when we enter a domain in which we have no reason to trust that our state space is well-formed enough for explicit EV estimates to tell us anything at all.

I think the Crazy Train metaphor highlights something important. I’ve pondered it a lot, and I no longer think that decisions to step off the Crazy Train reveal a squeamish inability to properly follow through on a theoretically principled analysis. Stops along the train to Crazy Town simply mark the places where we see the practical limits of a certain kind of quasi-formal reasoning. I now feel, more confidently, that alighting from the Crazy Train is the epistemically appropriate response, and one that doesn’t require us to shoehorn a justification for my (our) actions into a successor theory which shares a similar formal structure with expected value maximization. Instead, I take it to be an honest acknowledgment that my (our) theoretical commitments go deeper than sums of products, or clever theoretical constructions on top of sums of products. In the words of Wittgenstein:

A picture held us captive. And we could not get outside it, for it lay in our language and language seemed to repeat it to us inexorably.

  1. ^

    Many people (including Dustin) justify focus on areas like biorisk and AI in virtue of the risks posed to the present generation. However, I stick with the terminology of ‘longtermist’ grantmaking, because: (i) my discussion focuses on areas that (philosophical) longtermists tend to prioritize, and (ii) I’m focused on sociologically unusual applications of Bayesian Mindset; people who prioritize biorisk and AI risk based primarily on short-term considerations do so on the basis of an unusual set of cognitive tools (like treating speculative probability estimates seriously), which share much in common with Holden’s account of Bayesian Mindset.

  2. ^

    Carlsmith instead defers to timelines estimates from prior reports, of which Biological Anchors is one.

  3. ^

    One further example not included in the main text:

    Davidson’s Framework for Takeoff Speeds

    Davidson uses a semi-endogenous growth model to forecast how R&D investments affect technological progress, given estimates for (among many other parameters) AGI training requirements, and the ‘effective FLOP gap’ between the ‘most demanding’ and ‘80th percentile demanding’ tasks. In short, I’m not convinced by the track record of semi-endogenous growth models within economics, and don’t see much in the way of principled reasons for trusting the forecasts of such models for takeoff speeds.

    Davidson thinks his modeling strategy is the “~best you can do” when making technological predictions from R&D investment, though he also believes that simply saying “I just don’t trust any method that tries to predict the rate of technological progress from the amount of R&D investment” is a “valid perspective”.

  4. ^

    Indeed, there are various foundational criticisms in this genre that I think do meet this bar — see Linn, Soares, nostalgebraist, and Yudkowsky. The 2021 MIRI Conversations are perhaps the best example of the ‘worldview explication’-type projects I’m most enthused by.

  5. ^

    I like some of Sam Clarke’s suggestions for communicating deference over AI timelines.

  6. ^

    Vasco Grilo’s more recent post also provides a nice summary of some relevant evidence.

  7. ^

    In one EAG interview, Nick Beckstead claimed that there were “certain vibes of careful and precise reasoning” he believed to be societally neglected. I think that research on forecasting generalizability is, in other words, research on the degree to which (and domains in which) LEA’s implicit conception of “careful reasoning vibes” picks out vibes that actually help us navigate the world more successfully.

  8. ^

    Since drafting, a new dispute has emerged between Scott and Tyler on AI risk. I’m not a huge fan of either take, but I think Tyler’s response is illustrative: “[Scott’s] restatement of my argument is simply not what I wrote. Sorry Scott! There are plenty of arguments you just can’t put into the categories outlined in LessWrong posts.”

    On my preferred reading, Tyler is criticizing Scott for an overly colonizing use of Bayesian Mindset. Tyler is suggesting that something is awry with the background picture in which probabilistic estimates of questions relevant to AI risk are formed, and with the attempt to interpret disagreements as disagreements about probabilistic estimates within some shared picture of the world, rather than a more foundational disagreement about the benefits of a particular way of approaching practical epistemology.