Maybe this is covered in other publications from P4E, but from reading this I find it very difficult to understand whether and how the applied methodology leads to reliable prioritization choices.
Specifically:
- It’s pretty unclear what you are optimizing for.
- It’s pretty unclear how you aggregate evidence. E.g. most observers right now seem to agree that most risks vis-à-vis the 2026 elections lie in the tails (the mainline case being that the elections at large will be pretty clear), so the methodology would likely need to weigh severity against probability, i.e. calculate something like expected severity (see the sketch after this list).
- It’s pretty unclear what motivates the criteria. E.g. whether or not something happened over the last 30 years is clearly not a great filter when protecting against authoritarian backsliding.
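To make the expected-severity point from the second bullet concrete, here is a minimal sketch in Python; all scenario probabilities and severities are made up for illustration, not actual forecasts:

```python
# Toy expected-severity calculation. All probabilities and severities
# below are invented for illustration; they are not forecasts.

scenarios = [
    # (description, probability, severity on an arbitrary 0-100 scale)
    ("mainline: elections broadly clear",      0.90,   2),
    ("contested results, resolved in courts",  0.08,  20),
    ("tail: serious authoritarian escalation", 0.02, 100),
]

for name, p, s in scenarios:
    print(f"{name}: contribution = {p * s:.1f}")

total = sum(p * s for _, p, s in scenarios)
print(f"total expected severity: {total:.1f}")
# With these numbers the 2% tail scenario contributes 2.0 of 5.4 total
# expected severity, despite being 45x less likely than the mainline.
```

The mechanical point: under numbers like these, a 2% tail scenario drives more than a third of total expected severity, so a methodology that filters on likelihood alone would discard exactly the scenarios that matter most.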
Overall my impression reading this (and I want to be clear that this may be more a function of how it is written than of the thing really being absent) is that there is no systematic methodology for getting to high-impact interventions, but rather a series of plausible-sounding filters and steps that do not add up to a methodology producing reliable prioritization.
Somewhat roughly, protecting against democratic backsliding is structurally similar to work on global catastrophic risks: (a) most expected damage is in low-probability, high-severity incidents; (b) there is a trajectory dynamic; (c) threats are partially novel; (d) the ~RCT evidence base is thin, but there are robust stylized facts that help with prioritization. In this situation we should not expect a checklist-like approach to produce reliable recommendations, because the central challenge of prioritization in high-uncertainty contexts is how to weigh and synthesize competing dynamics (such as likelihood vs. severity).
Let me clarify the last point, which is the most important one:
A reliable recommendation about the highest-leverage tactic would require a methodology that weighs different factors against each other, taking into account that different factors have different spreads and different weights; a filtering checklist is fundamentally unable to do this.
Without the ability to quantify considerations (even if this means quantifying qualitative judgments), there is no way to make reliable recommendations, because you are trying to integrate a disparate set of considerations that are, fundamentally, related to each other in a broadly multiplicative manner (see Effectiveness is a Conjunction of Multipliers for the clearest articulation of why that is).
A checklist approach destroys a lot of information and thereby misleads about relative importance, which is what you are trying to evaluate. Because some outcomes you are trying to alleviate are much worse than others, some are much more likely than others, some are much more tractable to address than others, etc., a lot of this operates on variables that implicitly vary by orders of magnitude across interventions; a checklist approach will massively under-represent these differences and is thus unlikely to point at the highest-impact interventions. The toy sketch below illustrates the information loss.
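Here is a minimal sketch with two hypothetical interventions and three made-up factors (severity averted, probability, tractability); none of these numbers come from the post, they only illustrate the structure:

```python
# Toy comparison of checklist scoring vs. multiplicative scoring.
# Interventions and all factor values are hypothetical, chosen only to
# illustrate how a checklist compresses order-of-magnitude differences.

interventions = {
    # name: (severity averted, probability of scenario, tractability)
    "intervention A": (100.0, 0.10, 0.30),
    "intervention B": (1.0, 0.05, 0.60),
}

# Pass/fail bars a filtering checklist might use.
bars = {"severity": 1.0, "probability": 0.01, "tractability": 0.1}

for name, (severity, prob, tract) in interventions.items():
    # Checklist view: each factor either clears its bar or it does not.
    checklist = sum([
        severity >= bars["severity"],
        prob >= bars["probability"],
        tract >= bars["tractability"],
    ])
    # Multiplicative view: impact is roughly the product of the factors.
    product = severity * prob * tract
    print(f"{name}: checklist {checklist}/3, multiplicative {product:.2f}")

# Both interventions pass 3/3 checks, yet A's multiplicative score
# (3.00) is 100x B's (0.03): exactly the difference a prioritization
# methodology needs to surface.
```

Both interventions clear every bar, so the checklist rates them identically, while the multiplied factors differ by a factor of 100; that is precisely the information a filtering approach discards.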