Your definition seems to constrain ‘epistemic process’ to mere analytic tasks, and it’s a big leap from there to effective decision-making. For instance, I can imagine how LLMs could effectively produce resolvable, non-conditional questions and then answer them with relatively high accuracy. Yet there are three other tasks I’m more skeptical about: 1) generating conditional forecasting questions that encapsulate decision options; 2) making accurate probability judgments about those questions; and thus 3) the uptake of such forecasts into a ‘live’ decision process. This all seems more likely to work in environments with discrete and replicable processes, some of which you mention, like insurance calculations. But these tasks seem potentially unsolvable by LLMs in more complex decision environments that require ethical, political, and creative solutions. By ‘creative’ I mean solutions (e.g., conditional forecasting questions) that simply cannot be assembled from training data because the task is unique. What constitutes ‘unique’ is perhaps an interesting discussion in itself. Nevertheless, this post helped me work through some of these questions—thanks for sharing! Curious if you have any reactions.
DanSpoko
Yep, I agree with Abi. I also think this is true in any industry, or even just as a taxpaying citizen. It’s just really hard to have one’s ethics be completely aligned with anything. But exiting doesn’t make those ethical problems disappear. You just leave them for someone else to deal with.
Good questions. A few thoughts:
I think your assumptions are generally right, but I’d add one: One’s authority in a policy space is somewhat inversely proportional to the number of other people claiming expertise. The junior staffer who’s been laboring on an otherwise ignored issue will skyrocket in value at the moment of crisis. For example, how many Ukraine experts were there last year compared with today? If that junior staffer can rise to the moment, they can launch their career on a new upward trajectory. Meanwhile, comparatively few officials are working on the war in Yemen right now, which the UN has described as the “world’s worst humanitarian crisis.”
US diplomats spend an average of a third of their career in DC and two-thirds abroad. In this manner, the foreign service offers you multiple perspectives. You’ll understand issues not only through the DC-centric lens, but also through those of Beijing, Brussels, and Buenos Aires. The power lies in DC, but it’s harder to build real expertise and relationships sitting behind a computer screen in DC.
Building on the above answer, I’d ask: What’s your network for? Building expertise and building influence are sadly not always the same thing. If you want to build expertise on an international issue, it may be a significant advantage to build a great international network on that issue. That’s not something you can build easily from home. Living/working in DC is a significant advantage for building influence in DC, but that game is much easier if you also have recognized expertise. The foreign service isn’t the only way to build up expertise and influence, but it’s a great one, and it’s more accessible than most other pathways. But hey, if someone offers you a job on the NSC, go ahead and take it!
An anecdote: the US government has been trying to convince a foreign government to sign an agreement with the United States but has been repeatedly stymied by that country’s presidents from both parties for two decades. Let’s assume a forecast at that moment suggests a 10% chance the agreement will be signed within a year. A new ambassador designs a creative strategy that hasn’t been attempted before. Though the agreement would require executive signature, she decides instead to meet with every single member of parliament and tell them the United States would owe them if they came out publicly in favor of the deal. Fast forward a year, and the agreement is signed.
Another anecdote: the invention of the Apple computer.
Presumably you could use an LLM+scaffold to generate a range of options and compare conditional forecasts of their likelihood of success. But will it beat a human? I’m skeptical that an LLM is ever going to be able to “think” through the layers of contextual knowledge about a particular challenge (to say nothing of prioritizing the correct challenge in the first place) to be able to generate winning solutions.
Metric: give forecasters a slate of decision options—some calculated by LLM, some by humans—and see who wins.
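To make that metric concrete, here’s a minimal sketch of the scoring step: forecasters rate each blinded option’s probability of success, and we compare the averages by source. All names and numbers are illustrative assumptions, not real data or anything from the post.

```python
from statistics import mean

# Hypothetical blinded ratings: forecasters assign a probability of
# success to each decision option without knowing who generated it.
ratings = [
    {"source": "llm",   "p_success": 0.22},
    {"source": "llm",   "p_success": 0.35},
    {"source": "human", "p_success": 0.40},
    {"source": "human", "p_success": 0.55},
]

def mean_by_source(ratings, source):
    """Average forecasted success probability for one option source."""
    return mean(r["p_success"] for r in ratings if r["source"] == source)

print(mean_by_source(ratings, "llm"))    # 0.285
print(mean_by_source(ratings, "human"))  # 0.475
```

In a real version you’d also want resolution: track which options were actually adopted and succeeded, and score the forecasts themselves (e.g., with Brier scores), not just the ratings.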
Another thought on metrics: calculate a “similarity score” between a decision option and previous attempts at solving similar challenges. Almost like a metric that calculates “neglectedness” and “tractability”?
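One toy way to operationalize that similarity score: compare an option’s description against prior options with bag-of-words cosine similarity, and treat low maximum similarity as a rough novelty/neglectedness signal. This is only a sketch with made-up example strings; a serious version would use embeddings and a real corpus of past decisions.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two option descriptions."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Illustrative options (echoing the ambassador anecdote), not real data.
new_option = "lobby every member of parliament individually"
prior_options = [
    "negotiate directly with the executive",
    "lobby key members of parliament",
]

# Low similarity to every prior attempt suggests little precedent,
# i.e. a more "neglected" (or more creative) option.
novelty = 1 - max(cosine_similarity(new_option, p) for p in prior_options)
print(round(novelty, 2))  # 0.45
```

Word-overlap similarity will obviously miss paraphrases, which is exactly why the “what counts as unique” question above is hard: two options can be lexically distant but strategically identical.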