This piece defends a strong form of epistemic modesty: that, in most cases, one should pay scarcely any attention to what one finds the most persuasive view on an issue, hewing instead to an idealized consensus of experts. I start by better pinning down exactly what is meant by ‘epistemic modesty’, go on to offer a variety of reasons that motivate it, and reply to some common objections. Along the way, I show common traps people being inappropriately modest fall into. I conclude that modesty is a superior epistemic strategy, and ought to be more widely used, particularly in the EA/rationalist communities.
In virtually all cases, the credence you hold for any given belief should be dominated by the balance of credences held by your epistemic peers and superiors. One’s own convictions should weigh no more heavily in the balance than those of one other epistemic peer.
Introductions and clarifications
A favourable motivating case
Suppose your mother thinks she can make some easy money day trading blue-chip stocks, and plans to kick off tomorrow shorting Google on the stock market, as she’s sure it’s headed for a crash. You might want to dissuade her in a variety of ways.
You might appeal to an outside view:
Mum, when you make this short you’re going to be betting against some hedge fund, quant, or whatever else. They have loads of advantages: relevant background, better information, lots of data and computers, and so on. Do you really think you’re odds on to win this bet?
Or appeal to some reference class:
Mum, I’m pretty sure the research says that people trying to day-trade stocks tend not to make much money at all. Although you might hear some big successes on the internet, you don’t hear about everyone else who went bust. So why should you think you are likely to be one of these remarkable successes?
Or just cite disagreement:
Look Mum: Dad, sister, the grandparents and I all think this is a really bad idea. Please don’t do it!
Instead of directly challenging the object level claim (i.e. “Google isn’t overvalued, because X”), these considerations attempt to situate the cogniser within some population, and from characteristics of this population infer the likelihood of this cogniser getting things right.
Call the practice of using these considerations epistemic modesty. We can distinguish two components:
‘In theory’ modesty: That considerations of this type should in principle influence our credences.
‘In practice’ modesty: That one should in fact use these considerations when forming credences.
Weaker and stronger forms of modesty
Some degree of modesty is (almost) inarguable. If one leaves for work on Tuesday and finds all one’s neighbours have left their bins out, that’s at least reason to doubt one’s belief that bins were on Thursday, and perhaps sufficient to believe instead bins are on Tuesday (and follow suit with one’s own bins). If it appears that, say, the coagulation cascade ‘couldn’t evolve’, the near unanimity of assent for evolution among biologists at least counts against this, if it is not a decisive reason to believe, despite one’s impressions, that it could. Nick Beckstead suggests something like ‘elite common sense’ forms a prior which one should be hesitant to diverge from without good reason.
I argue for something much stronger (cf. the Provocation above): in theory, one’s credence in some proposition P should be almost wholly informed by modest considerations. That is, ceteris paribus, the fact it appears to you that P should weigh no more heavily in one’s determination regarding P than knowing that it appears to someone else that P. Not only is this the case in theory, but it is also the case in practice. One’s all things considered judgement on P should be just that implied by an idealized expert consensus on P, no matter one’s own convictions regarding P.
Motivations for more modesty
Why believe ‘strong form’ epistemic modesty? I first show families of cases where ‘strong modesty’ leads to predictably better performance, and show these results generalise widely.[1]
The symmetry case
Suppose Adam and Beatrice are perfect epistemic peers, equal in all respects which could bear on them forming more or less accurate beliefs. They disagree on a particular proposition P (say “This tree is an Oak tree”). They argue about this at length, such that all considerations Adam takes to favour “This is an Oak tree” are known to Beatrice, and vice versa.[2] After this, they still disagree: Adam has a credence of 0.8, Beatrice 0.4.
Suppose an outside party (call him Oliver) is asked for his credence of P, given Adam and Beatrice’s credences and their epistemic peer-hood to one another, but bereft of any object-level knowledge. He should split the difference between Adam and Beatrice − 0.6: Oliver doesn’t have any reason to favour Adam over Beatrice’s credence for P as they are epistemic peers, and so splitting the difference gives the least expected error.[3] If he was faced with a large class of similar situations (maybe Adam and Beatrice get into the same argument for Tree 2 to Tree 10,000) Oliver would find that difference splitting has lower error than biasing to either Adam or Beatrice’s credence.
Adam and Beatrice should do likewise. They also know they are epistemic peers, and so they should also know that for whatever considerations explain their difference (perhaps Adam is really persuaded by the leaf shapes, but Beatrice isn’t) Adam’s take and Beatrice’s take are no more likely to be right than one another. So Adam should go (and Beatrice vice-versa), “I don’t understand why Beatrice isn’t persuaded by the leaf shapes, but she expresses the same about why I find it so convincing. Given she is my epistemic peer, ‘She’s not getting it’, and, ‘I’m not getting it’ are equally likely. So we should meet in the middle”.
The underlying intuition is one of symmetry. Adam and Beatrice have the same information. The correct credence regarding P given this information should not depend on which brain Adam or Beatrice happens to inhabit. Given this, they should hold the same credence[4], and as Adam is as likely as Beatrice to be further from the truth, the shared credence should be in the middle.
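A rough way to see the symmetry argument numerically is the simulation below. It is a toy sketch rather than anything taken from the argument itself: it assumes each tree has some 'ideal' credence, that Adam and Beatrice each report that credence plus independent noise of the same size (which is what peerhood is taken to mean here), and that accuracy is scored by squared error (the Brier score). The function name `simulate_peers` and every parameter value are hypothetical.

```python
import random


def clip(x, lo=0.01, hi=0.99):
    """Keep a credence strictly inside (0, 1)."""
    return max(lo, min(hi, x))


def simulate_peers(n_trees=10_000, noise_sd=0.2, seed=0):
    """Mean Brier score for Adam, Beatrice and their midpoint.

    Assumption (not from the text): both peers see the same 'ideal'
    credence, each distorted by independent Gaussian noise of equal size.
    """
    rng = random.Random(seed)
    totals = {"adam": 0.0, "beatrice": 0.0, "midpoint": 0.0}
    for _ in range(n_trees):
        ideal = rng.uniform(0.1, 0.9)                  # hypothetical ideal credence
        is_oak = 1.0 if rng.random() < ideal else 0.0  # what the tree turns out to be
        adam = clip(ideal + rng.gauss(0.0, noise_sd))
        beatrice = clip(ideal + rng.gauss(0.0, noise_sd))
        credences = {
            "adam": adam,
            "beatrice": beatrice,
            "midpoint": (adam + beatrice) / 2,
        }
        for name, credence in credences.items():
            totals[name] += (credence - is_oak) ** 2   # squared error; lower is better
    return {name: total / n_trees for name, total in totals.items()}


scores = simulate_peers()
for name, score in scores.items():
    print(f"{name:>9}: mean Brier score {score:.4f}")
```

Under these assumptions the midpoint scores better than either peer's own credence across many trees, which is the pattern Oliver is said to find. If the peers' errors were correlated, or one peer were more reliable, the gap would shrink or the best weighting would shift away from the middle.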
Compressed sensing of (and not double-counting) the object level
It seems odd that both Adam and Beatrice do better discarding their object level considerations regarding P. If we adjust the scenario above so they cannot discuss with one another but are merely informed of each other’s credences (and that they are peers regarding P), the right strategy remains to meet in the middle.[5] Yet how come Adam and Beatrice are doing better if they ignore relevant information? Both Adam and Beatrice have their ‘inside view’ evidence (i.e. what they take to bear on the credence of P) and the ‘outside view’ evidence (what each other think about P). Why not use a hybrid strategy which uses both?
Yet to whatever extent Adam or Beatrice’s hybrid approach leads them to diverge from equal weight, they will do worse. Oliver can use the ‘meet in the middle’ strategy to get expectedly better accuracy than either gets by biasing towards their own inside view determination. In betting terms, Oliver can arbitrage any difference in credence between Adam and Beatrice.
We can explain why: the credences Adam and Beatrice offer can be thought of as very compressed summaries of the considerations they take to bear upon P. Whatever ‘inside view’ considerations Adam took to bear upon P are already ‘priced in’ to the credence he reports (ditto Beatrice). Modesty is not ignoring this evidence, but weighing it appropriately: if Adam then tries to adjust the outside view determination by his own take on the balance of evidence, he double counts his inside view: once in itself, and once more by including his credence as weighing equally to Beatrice’s in giving the outside view.
One’s take on the set of considerations regarding P may err, either by bias,[6] ignorance, or ‘innocent’ mistake. Splitting the difference between you and your peer’s very high level summary of these captures the great fraction of benefit of hashing out where these summaries differ.[7] Modesty correctly diagnoses that one’s high level summary is no more likely to be more accurate than one’s peers, and so holds those in equal regard, even in cases where the components of one’s own summary are known better.
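The cost of double counting can be made concrete with a small calculation. This is a sketch under assumptions the text does not state: each peer's credence is treated as the truth plus an independent, unbiased error of the same variance, so a weighted blend has error variance proportional to the sum of the squared weights. The name `pooled_error_variance` is hypothetical.

```python
def pooled_error_variance(own_weight, noise_var=1.0):
    """Error variance of own_weight * own + (1 - own_weight) * peer.

    Assumes (for illustration only) independent, unbiased errors with
    the same variance `noise_var` for both peers.
    """
    peer_weight = 1.0 - own_weight
    return (own_weight ** 2 + peer_weight ** 2) * noise_var


strategies = [
    ("equal weight", 0.5),
    ("mild hybrid", 0.6),
    ("double counting", 2 / 3),  # own view counted twice, peer's once
    ("steadfast", 1.0),
]
for label, weight in strategies:
    variance = pooled_error_variance(weight)
    print(f"{label:>15}: own weight {weight:.2f}, error variance {variance:.3f}")
```

In this toy model equal weighting minimises the error, and any tilt towards one's own view only adds to it. Counting one's own credence twice (a weight of two thirds) raises the error variance from one half to five ninths of a single peer's.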
Repeated measures, brains as credence sensors, and the wisdom of crowds
Modesty outperforms non-modesty in the n=2 case. The degree of outperformance grows (albeit concavely) as n increases.
Scientific fields often have to deal with unreliable measurement. They commonly mitigate this by having repeat measurement. If you have a crummy thermometer, repeating readings several times improves accuracy over just the once. Human brains also try and measure things, and they are also often unreliable. It is commonly observed that nonetheless the average of their measurement tends to lie closer to the mark than the vast majority of individual measurements. Consider the commonplace ‘guess how many skittles are in this jar’ or similar estimation games: the usual observation is that the average of all the guesses is better than all (or almost all) the individual guesses.
A toy model makes this unsurprising. The individual guesses will form some distribution centered on the true value. Thus the expected error of a given individual guess is the standard deviation of this distribution. The expected error of the average of all guesses is given by the standard error, which is the standard deviation divided by the square root of the number of guesses:[8] with 10 individuals, the error is about 3 times smaller than the expected error of each individual guess; with 100, 10 times smaller; and so on.
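The toy model can be checked with a short simulation. This is a sketch with invented numbers: guesses are assumed to be independent and normally distributed around the true count, and 'expected error' is read as root-mean-square error. The function names and the jar size are hypothetical.

```python
import math
import random


def rms(values):
    """Root-mean-square of a list of errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def crowd_vs_individual(n_guessers, true_count=500.0, guess_sd=100.0,
                        n_trials=2_000, seed=0):
    """How many times smaller the crowd average's error is than one guess's.

    Assumes independent, unbiased, normally distributed guesses.
    """
    rng = random.Random(seed)
    individual_errors = []
    crowd_errors = []
    for _ in range(n_trials):
        guesses = [rng.gauss(true_count, guess_sd) for _ in range(n_guessers)]
        individual_errors.extend(g - true_count for g in guesses)
        crowd_errors.append(sum(guesses) / n_guessers - true_count)
    return rms(individual_errors) / rms(crowd_errors)


for n in (10, 100):
    predicted = math.sqrt(n)  # the standard-error prediction
    simulated = crowd_vs_individual(n)
    print(f"{n:>3} guessers: predicted {predicted:.2f}x, simulated {simulated:.2f}x")
```

The simulated ratios should sit close to the square root of the number of guessers. The result leans on the guesses being unbiased and independent: a shared bias does not average out, and correlated errors shrink the gain.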
Analogously, human brains also try to measure credences or degrees of belief, and are as imperfect here as when they’re trying to estimate ‘number of X’. Yet one may expect a similar effect to this ‘wisdom of crowds’ to operate here too. In the same way Adam and Beatrice would do better in the situation above if they took the average (even if it went against their view of the balance of reasons by their lights), if Adam-to-Zabaleta (all epistemic peers) investigated the same P, they’d expect to do better if they took the average of their group versus steadfastly holding to the credence they arrived at ‘by their lights’. Whatever inaccuracies throw off their individual estimates of P somewhat cancel out.
Deferring to better brains
The arguments above apply to cases where one is an epistemic peer. If not, one needs to adjust by some measure of ‘epistemic virtue’. In cases where Adam is an epistemic superior to Beatrice, they should meet closer to Adam’s view, commensurate with the degree of epistemic superiority (and vice versa).
Although reasons for being an epistemic superior could be ‘they’re a superforecaster’ or ‘they’re smarter than I am’, perhaps the most common source of epistemic superiors lies under the heading of ‘subject matter expert’. On topics from human nutrition, to voting rules, to the impact of the minimum wage, to the nature of consciousness, to basically anything that isn’t trivial, one can usually find a fairly large group of very smart people who spend many years studying that topic, who make public their views about this topic (sometimes not even behind a paywall). That they at least have a much greater body of relevant information and have spent longer thinking about it gives them a large advantage compared to you.
In such cases, the analogy might be that your brain is a sundial, whilst theirs is an atomic clock. So if you have the option of taking their readings rather than yours, you should do so. The evidence a reading of a sundial provides about the time conditional on the atomic clock reading is effectively zero. ‘Splitting the difference’ in analogous epistemic cases should result in both you and your epistemic superior agreeing that they are right and you are wrong.
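One way to make 'meeting closer to the superior's view' concrete is inverse-variance weighting, where each reading counts in proportion to its precision. This is an assumption brought in for illustration, not a rule the text commits to, and the function name `precision_weighted` and all the numbers are invented.

```python
def precision_weighted(readings):
    """Inverse-variance weighted mean of (value, error_sd) pairs.

    Assumes (for illustration) unbiased readings with independent errors
    whose standard deviations are known.
    """
    weights = [1.0 / sd ** 2 for _, sd in readings]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, readings))
    return weighted_sum / sum(weights)


# Epistemic peers: equal reliability, so the answer is the midpoint.
peers = precision_weighted([(0.8, 0.1), (0.4, 0.1)])

# Sundial versus atomic clock, in minutes after midnight.
sundial = (750.0, 20.0)        # reads 12:30, good to about 20 minutes
atomic_clock = (727.0, 0.001)  # reads 12:07, good to a fraction of a second
combined = precision_weighted([sundial, atomic_clock])

print(f"peers meet at {peers:.2f}")
print(f"combined time is {combined:.6f} minutes after midnight")
```

With equal reliability the scheme reduces to splitting the difference. With the sundial and the atomic clock, the combined estimate is indistinguishable from the atomic clock's reading, which matches the claim that the sundial adds effectively nothing once the better reading is known.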
Inference to the ideal epistemic observer
We can summarise these motivations by analogy to ideal observers (used elsewhere in perception and ethical theory). We can gesture that an ideal (epistemic) observer is just that which is able to form the most accurate credence for P given whatever prior: we can explain they have vast intelligence, full knowledge of all matters that bear upon P, perfect judgement, and in essence all epistemic virtues in excelsis.
Now consider this helpful fiction:
The epistemic fall: Imagine a population solely comprised of ideal observers, who all share the same (correct) view on P. Overnight their epistemic virtues are assailed: they lose some of their reasoning capacity; they pick up particular biases that could throw them one way or another; they lose information, and so on, and each one to varying degrees.
They wake up to find they now have all sorts of different credences about P, and none of them can remember what credence they all held yesterday. What should they do?
It seems our fallen ideal observers can begin to piece together what their original credence was about P by finding out more about their credences and remaining epistemic virtue, and so backpropagate their return to epistemic apotheosis. If they find they’re all similarly virtuous and are evenly scattered, their best guess is the ideal observer was in the middle of the distribution (cf. the wisdom of crowds). If they see a trend that those with greater residual virtue tend to hold a higher credence in P, they should attempt to extrapolate this trend to suggest the ideal agent origin from which they were differentially blown off course. If they see one group demonstrates a bias that others do not, they can correct the position of this group before trying these procedures. If they find the more virtuous agents are more scattered regarding P, (or that they segregate into widely dispersed aggregations), this should make them very unsure about where the ideal observer initially was. And so on.
Such a model clarifies the benefit of modesty. Although we didn’t have some grand epistemic fall, it is clear we all fall manifestly short of an ideal observer. Yet we all fall short in different respects, and in different degrees. One should want to believe whatever one would believe if one was an ideal observer, shorn of one’s manifest epistemic vices. Purely immodest views must say their best guess is the ideal observer would think the same as they do, and hope that all the vicissitudes of their epistemic vice happen to cancel out. By accounting for the distribution of cognisers, modesty allows a much better forecast, and so a much more accurate belief. And the best such forecast is the strong form of modesty, where one’s particular datapoint, in and of itself, should not be counted higher than any other.
Excursus: Against common justifications for immodesty
So much for strong modesty in theory. How does it perform in practice?
One rough heuristic for strong modesty is this: for any question, find the plausible expert class to answer that question (e.g. if P is whether to raise the minimum wage, talk to economists). If this class converges on a particular answer, believe that answer too. If they do not agree, have little confidence in any answer. Do this no matter one’s impression of the object level considerations that recommend (by your lights) a particular answer.
Such a model captures all the common sense cases of modesty—trust the results in typical textbooks, defer to consensus in cases like when to put the bins out, and so on. I now show it is also better in many cases where people think it is better to be immodest.
Being ‘well informed’ (or even true expertise) is not enough
A common refrain is that one is entitled to ‘join issue’ with the experts due to one having made some non-trivial effort at improving one’s knowledge of the subject. “Sure, I accept experts widely disagree on macro-economics, but I’m confident in neo-Keynesianism after many months of careful study and reflection.”
This doesn’t fly by the symmetry argument above. Our outsider observes widespread disagreement in the area of macroeconomics, and that many experts who spend years on the subject nonetheless greatly disagree. Although it is possible the ideal observer would have been in one or another of the ‘camps’ (the clustering implies intermediate positions are less plausible), the outsider cannot adjudicate which one if we grant the economists in each appear to have similar levels of epistemic virtue. The balance of this outside view changes imperceptibly if another person who despite a few months of study remains nowhere near peerhood (let alone superiority) of these divided experts, happens to side with one camp or another. By symmetry, one’s own view of the balance of reason should remain unchanged if this ‘another person’ happened to be you.
The same applies even if you are a bona fide expert. Unless the distribution of expertise is such that there is a lone ‘world authority’ above all others (and you’re them) your fellow experts form your epistemic peer group. Taking the outside view is still the better bet: the consensus of experts tends to be right more often than dissenting experts, and so some difference splitting (weighed more to the consensus owing to their greater numbers) is the right answer.[9]
Common knowledge ‘silver bullet arguments’
Suppose one takes an introductory class in economics. From this, one sees there must be a ‘knock-down’ argument against a minimum wage:
Well, suppose you’re an employee whose true value on the free market is less than the minimum wage. But under the minimum wage, the firm might decide against charitably employing you above your market value, and just fire you instead. You’re worse off, as you’re on the dole, and the firm’s worse off, as it has to meet its labour demand another way. Everyone’s lost! So much for the minimum wage!
Yet one quickly discovers economists seem to be deeply divided over the merits of the minimum wage (as they are about most other things). See for example this poll suggesting 38 economic experts in the US are pretty evenly divided on whether the minimum wage would ‘hit’ employment for low-skill workers, and leant in favour of the minimum wage ‘all things considered’.
It seems risible to suppose these economists don’t know their economics 101. What seems much more likely is that they know other things that you don’t which make the minimum wage more reasonable than your jejune understanding of the subject suggests. One need not belabour which side the outside view strongly prefers.
Yet it is depressingly common for people to confidently hold that view X or Y is decisively refuted by some point or another, notwithstanding the fact this point is well known to the group of experts that nonetheless hold X or Y. Of course in some cases one really has touched on the decisive point the experts have failed to appreciate. More often, one is proclaiming that one is on the wrong side of the Dunning-Kruger effect.
Debunking the expert class (but not you)
To the litany of cases where (apparent) experts screwed up, we can add verses without end. So we might be inclined to debunk a particular ‘expert consensus’ due to some bias or irrationality we can identify. Thus, having seen there are no ‘real’ experts to help us, we must look at the object level case.
The key question is this: “How are you better?” And it is here that debunking attempts often flounder:
An undercutting defeater for one aspect of epistemic superiority for the expert class is not good enough. Maybe one can show the expert class has a poor predictive track record in their field. Unless one has a better track record in their field, this puts you on a par with respect to this desideratum of epistemic virtue. They likely have others (e.g. more relevant object-level knowledge) that should still give them an edge, albeit attenuated.
An undercutting defeater that seems to apply equally well to oneself as the expert class also isn’t enough. Suppose (say) economics is riven by ideological bias: why are you less susceptible to these biases? The same ideological biases that might plague professional economists may also plague amateur economists, but the former retain other advantages.
Even if a proposed debunking is ‘selectively toxic’ to the experts versus you, they still might be your epistemic superior all things considered. Both Big Pharma and Professional Philosophy may be misaligned, but perhaps not so much to be orthogonal or antiparallel to the truth: in both they still expectedly benefit by finding drugs that work or making good arguments respectively. They may still fare better overall than, “Intelligent layperson who’s read extensively”, even if the latter is not subject to ‘publish or perish’ or similar.
Even if a proposed debunking shows one as decisively superior to that expert class, there may be another expert class which remains epistemically superior to you. Maybe you can persuasively show professional philosophers are so compromised on consciousness that they should not be deferred to about it. Then the real expert class may simply switch to something like ‘intelligent people outside the academy who think a lot about the topic’. If it’s the case that this group of people do not share your confidence in your view, it seems outsiders should still reject it—as should you.
It need not be said that the track record for these debunking defeaters is poor. Most crackpots have a persecution narrative to explain why the mainstream doesn’t recognise or understand them, and some of the most mordant criticisms of the medical establishment arise from those touting complementary medicine. Thus ‘explaining away’ expert disagreement may not put one in a more propitious reference class than one started from. One should be particularly suspicious of debunking(s) sufficiently general that the person holding the unorthodox view has no epistemic peers—they are akin to Moses, descending from Mt. Sinai, bringing down God-breathed truth for the rest of us.[10]
Private evidence and pet arguments
Suppose one thinks one is in receipt of a powerful piece of private evidence: maybe you’ve got new data or a new insight. So even though the experts are generally in the right, in this particular case they are wrong because they are unaware of this new consideration.
New knowledge will not spread instantaneously, and that someone can be ‘ahead of the curve’ comes as no surprise. Yet many people who take themselves to have private evidence are wrong: maybe experts know about it but don’t bother to discuss it because it is so weak, or it is already in the literature (but you haven’t seen it), or it isn’t actually relevant to the topic, or whatever else. Most mavericks who take themselves to have new evidence that overturns consensus are mistaken.
The natural risk is people tend to be too partial to their pet arguments or pet data, and so give them undue weight, and so one’s ‘insider’ perceptions should perhaps be attenuated by this fact. I suspect most are overconfident here.[11] If this private evidence really is powerful, one should expect it to be persuasive to members of this expert class once they become aware of it. So it seems the credence one should have is the (appropriately discounted) forecast of what the expert class would think once you provide them this evidence.
The natural test of the power of this private evidence is to make it public. If one observes experts (or just epistemic peers) shift to your view, you were right about how powerful this evidence was. If instead one sees a much more modest change in opinion, this should lead one to downgrade your estimate as to how powerful this evidence really is (and perhaps provide calibration data for next time). Holding instead this really is decisive evidence leads one to the problematic ‘common knowledge silver bullet’ case discussed above. Inferring from this experts just can’t understand your reasoning or are biased against outsiders or whatever else produces a suspiciously self-serving debunking argument, also discussed above.
Objections
So much for the case in favour. What about the case against? I divide objections into those ‘in theory’, and those ‘in practice’.
It is not the case you can bootstrap an outside view from nothing. One needs to at least start with some considerations as to what makes one an epistemic peer or superior, and probably some minimal background knowledge of ‘aboutness’ to place topics under one or another expert class.
In the same way large amounts of our empirical information are now derived by instrument rather than direct application of our senses (but were ultimately germinated from direct sensory experience), large amounts of our epistemic information can be derived by deferring to better (or more) brains rather than using our own, even if this relies on some initial seed epistemology we have to realise for ourselves. This ‘germinal set of claims’ can still be modestly revised later.
Immodestly modest?
One line of attack from the social epistemology literature is that strong forms of modesty are self-defeating. If one is modest, one should assumedly be modest about ‘What is the right way to form beliefs if epistemic peers disagree with you?’ Yet one finds that very few people endorse the sort of epistemic modesty advocated above. When one looks among potential expert classes, such as more intelligent friends of mine (i.e. friends of mine), epistemologists, and so on, conciliatory views like these command only a minority. So the epistemically modest should vanish as they defer to the more steadfast consensus.
If so, so much the worse for modesty. I offer a couple of incomplete defences:
One is haggling over the topic of disagreement. In my limited reading of ‘equal weight/conciliatory views and their detractors’, I take the detractors to be suggesting something like “one is ‘within one’s rights’ to be steadfast”, rather than something like “you’re more accurate if you’re steadfast”. Maybe there are epistemic virtues which aren’t the same as being more accurate. Yet there may be less disagreement on ‘conditional on an accuracy first view, is modesty the right approach?’
This only gets so far (after all, shouldn’t we be modest about whether to care only about accuracy?). A more general defence is this: the ‘what if you apply the theory to itself?’ problem looks pretty pervasive across theories.[13] Accounts of moral uncertainty that in whatever sense involve weighing normative theories by their plausibility tend to run into problems if the same accounts are applied ‘one level up’ to meta-moral uncertainty. Bayesian accounts of epistemology seem to go haywire if we think one should have a credence in Bayesian epistemology itself, especially if one assigns any non-zero credence on any theory which entails object level credences have undefined values.
Closer to home, milder versions of conciliation (e.g. “Pay some attention to peer disagreement, but it’s not the only factor”) share a similarly troublesome recursive loop (“Well, I see most other people are steadfast, so I should update to be a bit less conciliatory, but now I have to apply my modified view to this disagreement again”) and neat convergence is not guaranteed. The theories which avoid this problem (e.g. ‘Wholly steadfast, so peer disagreement should be ignored’), tend to be the least plausible on the object level (e.g. That if you believe bins are on Thursday, the fact all your neighbours have their bins out on Tuesday is not even reason to reconsider your belief).
A solution to these types of problems remains elusive. Yet modesty finds itself in fairly good company. It may be the case that a good resolution to this type of issue would rule out the strong form of modesty advocated here, in favour of some intermediate view. Until then, I hope the (admittedly inelegant) “Be modest, save for meta-epistemic norms about modesty itself” is not too great a cost to bear across the scales from the merits of the approach.
In practice
I take most of the action to surround whether modesty makes sense as a practical procedure in the real world, even granting its ‘in theory’ virtue. Given the strength of modesty I advocate, the fact we use something like it in some cases, and we can identify it can help in others, is not enough. It needs to be shown as a better strategy than even slightly weaker forms, in circumstances deliberately selected to pose the greatest challenge to strong modesty.
Trivial (and less trivial) non-use cases
For some topics there are no relevant epistemic peers or superiors to consider. This is commonly the case with pretty trivial beliefs (e.g. my desk is yellow).
Modesty also doesn’t help much for individual tastes, idiosyncrasies, or circumstances. If Adam works best listening to Bach and Beatrice to Beethoven, they probably won’t do better ‘meeting in the middle’ and both going half-and-half for each (or maybe picking a composer intermediate in history, like Mozart). Anyway, Adam is probably Beatrice’s significant epistemic superior on “What music does Adam work best listening to?”, and vice-versa. One can also be credulous of claims like “It turned out this diet really helped my back pain”: perhaps it’s placebo, or perhaps it is one of those cases where different things work for different people, and one expects in such cases individuals to have privileged access to what worked for them.[14]
There will be cases where one really is plowing a lonely furrow where there aren’t any close epistemic peers or superiors. It’s possible I really am the world’s leading expert on “How many counter-factual DALYs does a doctor avert during their career?”, because no one else has really looked into this question. My current role involves investigating global catastrophic biological risks, which appears understudied to the point of being pre-paradigmatic.
These comprise a very small minority of topics I have credences about. Yet even here modesty can help. One can use more distant bodies of experts: I am reassured that my autumnal estimate for the ‘DALY question’ coheres with expert consensus that medical practice had a minor role in improvements to human health, for example. Even if I don’t have any epistemic peers, I can simulate some by asking, “If there were lots of people as or more reasonable than me looking at this, would I expect them to agree with my take?” Given that the econometric-esque methods I deploy to answer the ‘DALY question’ could probably be done better by an expert, and in any case reasonable people are often sceptical of these in other areas, I am less confident of my findings than my ‘inside view’ suggests, which I take to be a welcome corrective to ‘pet argument’ biases.[15]
In theory, the world should be mad
Whether devoured by Moloch, burned by Ra, trapped by aberrant signalling equilibria, or whatever else, we can expect to predict when apparent expert classes (and apparent epistemic peers) are going to collectively go wrong. With this knowledge, we can know on which topics we should expect ourselves to outperform expertise. Rather than the scenario where we commonly find ourselves looking up (at experts) or around (at our peers), we find ourselves in many situations where those who are usually epistemic peers or superiors are below us, and above us, only sky.
We could distinguish two sorts of madness, a surprising absence of expertise and a surprising error of expertise:
The former is a gap in the epistemic market. Although an important topic should be combed over by a body of experts, for whatever reason it isn’t, and so it takes surprisingly little effort to climb to the summit of epistemic superiority. In such cases our summaries of expert classes as ranging over a broad area conceal that the degree of expertise is very patchy: public health experts generally know a great deal about the health impacts of smoking; they usually know much less about the health impacts of nicotine.
The latter is a stronger debunking argument. One appeals to some features of the world that generate expertise and suggests that these expertise-generating features are anti-correlated to the truth, thus one can adjudicate between warring expert camps (or just indict all so-called ‘experts’) based on this knowledge. One strong predictor of incompatibilism regarding free will among philosophers is believing in God. If we are confident these beliefs in God are irrational, then we can winnow the expert class by this consideration and side with the compatibilist camp much more strongly.
Yet, similar to the problems of debunking mentioned earlier, that there is a good story suggesting one of these things does not imply one will do better ‘striking out on one’s own’. Even in cases of disease where accuracy is poorly correlated to expert activity, it is hard to think of cases where these line up orthogonal or worse. Big pharma studies are infamous, but even if you’re in big pharma optimising for ‘can I get evidence to support my product’, your drug actually working does make this easier. Even in pre-replication crisis psychology, true results would be overrepresented versus false ones in the literature compared to some base rate across generated hypotheses.
The ‘residual’ expert class still often remains better. Although most public health experts know little about nicotine per se, there are some nearby health experts, perhaps scattered across our common-sense demarcation of fields, who do know about the impacts of nicotine. It may still take quite a lot of effort to reach parity or superiority to these. Even if we want to strike all theists from free will philosophers, compatibilism does not rise close to unanimity, and so cautions against extremely high confidence this is the correct view.[16] So, I aver, the world is not that mad.
Empirically, the world is mad
One can offer a more direct demonstration of world-madness, and so refute modesty: outperformance.
A common reply is to point to a particular case where those being modest would have gotten it wrong. There are lots of cases where amateurs and mavericks were ridiculed by common sense or experts-at-the-time, only to be subsequently vindicated.
Another problem is the modest view introduces a lag—it seems one often needs to wait for the new information to take root among one’s epistemic peers before changing one’s view, whilst a cogniser just relying on the object level updates on correct arguments ‘at first sight’. It is often crucially important to be fast as well as right in both empirical and moral matters: it is extremely costly if a view makes one slower to recognise (among many other past moral catastrophes) the horror of slavery.
Yet modesty need not be infallible, merely an improvement. Citing cases where it goes poorly is (hopefully less than) half the story. Modesty does worse in cases where the maverick is right, yet better where the maverick is wrong: there are more cases of the latter than the former. Modesty does worse in being sluggish in responding to moral revolutions, yet better at avoiding being swept away by waves of mistaken sentiment: again, the latter seem more common than the former.[17]
Maybe one can follow a strategy such that you can ‘pick the hits’ of when to carve out exceptions, and so have a superior track record. Yet, empirically, I don’t see it. When I look at people who are touted as particularly good at being ‘correct contrarians’, I see at best something like an ‘epistemic venture capitalist’: their bold contrarian guesses are right more often than chance, but not right more often than not. They appear by my lights to be unable to judiciously ‘pick their battles’, staking out radical views in topics where there isn’t a good story as to why the experts would be getting this wrong (still less why they’re more likely to get it right). So although they do get big wins, the modal outcome of their contrarian take is a bust.[18]
Modesty should price in the views of better-than-chance contrarians into how it weighs consensus. Confidence in a consensus view should fall if a good contrarian takes aim at it, but not so much one now takes the contrarian view oneself. If one happens to be a particularly successful contrarian one should follow the same approach: “I get these right surprisingly often, but I’m still wrong more often than not, so it might be worth it to look into this further to see if I can strike gold, but until then I should bank on the consensus view.”
Expert groups are seldom in reflective equilibrium
Even if modesty works well in the ideal case of a clearly identified ‘expert class’, it can get a lot messier in reality:
Suppose one is in the early 1940s and asks, “Are there going to be explosives with many orders of magnitude more power than current explosives?” One can imagine if one consulted explosive experts (however we cash that out), their consensus would generally say ‘no’. If one was able to talk to the physicists working on the Manhattan project, they would say ‘yes’. Which one should an outside view believe?[19]
Most people believe god exists (the so called ‘common consent argument for God’s existence’); if one looks at potential expert classes (e.g. philosophers, people who are more intelligent), most of them are Atheists. Yet if one looks at philosophers of religion (who spend a lot of time on arguments for or against God’s existence), most of them are Theists—but maybe there’s a gradient within them too. Which group, exactly, should be weighed most heavily?
So constructing the ideal ‘weighted consensus’ modesty recommends deferring to can become a pretty involved procedure. One must carefully divine whether a given topic lies closer to the magisterium of one or another putative expert class (e.g. maybe one should lean more to the physicists, as the question is really more ‘about physics’ than ‘about explosives’). One might have to carefully weigh up the relevant epistemic virtues of various expert classes that appear far from reflective equilibrium with one another (so perhaps one might use the likely selection effect in philosophy of religion to partly discount the apparent support this provides). One might have to delve into complicated issues of independence: although most people may believe god exists, unlike guesses of how many skittles are in the jar, they are not all forming this belief independently from one another.[20]
This exercise begins to look increasingly insider-view-esque. Trying to determine the right magisterium involves getting closer to object level considerations about ‘aboutness’ of topics; trying to tease apart issues of independence and selection amounts to looking at belief-forming practices, and veers close to object level justifications for the belief in question. At some point it becomes extraordinarily challenging to try and back-trace from all these factors to the likely position of the ideal observer: the degrees of freedom these considerations invite (and the challenge in estimating them reliably) make strong modesty go worse.
One should not give up too early, though: modesty can still work pretty well even in these tricky cases. One can ask whether there’s any communication between the classes, and if so any direction of travel (e.g. did some explosive experts end up talking to the physicists, and agreeing they were right? Vice-versa?). Even if they were completely isolated, one can ask if a third group having access to both made a decision (e.g. the agreement of the U.S. and German governments with the implied view of the physicists). This is a lot more involved, but the expected ‘accuracy yield per unit time spent’ may still be greater than (for example) making a careful study of the relevant physics.
A broader modification would be ‘immodest only for the web of belief, but modest for the weights’: one uses an inside view to piece together the graph of considerations around P, but one still defers to consensus on the weights. This may avoid cases where (for example) strong modesty may mistake astronomers (rather than primordial rocket scientists) for the expert class on whether space travel is infeasible, even though astronomers and rocket scientists agreed about the necessary acceleration, but astronomers were inexpert on the key question as to whether that acceleration could be produced.[21]
What if one cannot even do that? Then modesty (rightly) offers a counsel of despair. If an area is so fractious there’s no agreement, with no way to see which of numerous disparate camps has better access to the truth of the matter; so suffused with bias that even those with apparent epistemic virtues (e.g. judgement, intelligence, subject-matter knowledge) cannot be seen to even tend towards the truth; what hope does one have to do better than they? In attempting to thread the needle through these hazards towards the right judgement, one will almost certainly run aground somewhere or somehow, like all one’s epistemic peers or superiors who made the attempt before. Perhaps reality obliges us to undertake these doxastic suicide missions from time to time. If modesty cannot help us, it can at least provide the solace of a pre-emptive funeral, rather than (as immodest views would) cheer us on to our almost certain demise.
Somewhat satisfying Shulman
Carl Shulman encourages me to offer my credences and rationale in cases he takes to be particularly difficult for my view, and suggests in these cases I either arrive at absurd credences or I am covertly abandoning the strong modesty approach. I offer these below for readers to decide—with the rider that if these are in fact absurd, ‘I’m an idiot’ is a competing explanation to ‘strong modesty is a bad epistemic practice’ (and that, assuredly, whatever one’s credence on the latter, one’s credence in the former should be far greater).
Mostly discount common consent (non-independence) and PoR (selection). Major hits from more intelligent people/ better informed tend to be atheist, but struggle to extrapolate this closer to 0 given existence proofs of very epistemically virtuous religious people.
Libertarian free will (credence 0.1): Commands a non-trivial minority across virtuous epistemic classes (philosophers, intelligent people, etc), only somewhat degraded by selection worries.
Jesus rose from the dead (credence 0.005): Christianity in particular a very small fraction of possibility space of Theism. Support from its widespread support is mostly (but not wholly) screened off by non-independence effects. Relevant (but distant) expert classes in history etc. weigh adversely.
There has been a case of cold fusion (credence 10^-5): Strong pan scientific consensus against, cold fusion community looks renegade and much less epistemically virtuous. Base rate of these conditional on no effect gives very adverse reference class.
ESP (credence 10^-6): Very strong (but non-complete) trophism among elite common sense, scientists, etc; bad predictive track records for ESP researchers; distant consensuses highly adverse. Some greatly attenuated boost from survey data/small fraction of reasonable believers.
Practical challenges to modesty
Modesty can lead to double-counting, or even groupthink. Suppose in the original example Beatrice does what I suggest and revises her credence to 0.6, but Adam doesn’t. Now Charlie forms his own view (say 0.4 as well) and does the same procedure as Beatrice, so Charlie now holds a credence of 0.6 as well. The average should be lower: (0.8+0.4+0.4)/3, not (0.8+0.6+0.4)/3, but the results are distorted by using one-and-a-half helpings of Adam’s credence. With larger cases one can imagine people wrongly deferring their way into consensus around a view they should think is implausible, and in general one faces the nigh-intractable challenge of trying to infer cases of double counting from the patterns of ‘all things considered’ evidence.
One can rectify this by distinguishing ‘credence by my lights’ versus ‘credence all things considered’. So one can say “Well, by my lights the credence of P is 0.8, but my actual credence is 0.6, once I account for the views of my epistemic peers etc.” Ironically, one’s personal ‘inside view’ of the evidence is usually the most helpful credence to publicly report (as it helps others modestly aggregate), whilst one’s all-things-considered modest view is usually for private consumption.
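The arithmetic in this example can be checked with a short sketch. It assumes straight (equal-weight) averaging as the deferral rule, which the essay treats only as an approximation; the function and variable names are illustrative assumptions, not anything from the original discussion.

```python
# Credences 'by my lights' in the essay's example.
ADAM, BEATRICE, CHARLIE = 0.8, 0.4, 0.4


def average(credences):
    """Straight (equal-weight) average of a list of credences."""
    return sum(credences) / len(credences)


# Beatrice defers to Adam by splitting the difference, then reports only the result.
beatrice_reported = average([ADAM, BEATRICE])

# Charlie averages what he hears: Adam's view, Beatrice's already-deferred
# view, and his own.
charlie_double_counted = average([ADAM, beatrice_reported, CHARLIE])

# If everyone had reported their 'by my lights' credence, Charlie would
# average these three numbers instead.
charlie_without_double_counting = average([ADAM, BEATRICE, CHARLIE])

# Effective weight on Adam in the double-counted average: his own third plus
# half of Beatrice's third, i.e. one-and-a-half helpings out of three.
adam_weight = 1 / 3 + 0.5 * (1 / 3)

print(f"double-counted average:   {charlie_double_counted:.3f}")
print(f"average without overlap:  {charlie_without_double_counting:.3f}")
print(f"Adam's effective weight:  {adam_weight:.2f}")
```

With these numbers the double-counted figure is 0.6 against roughly 0.53 for the average of the three original credences, because Adam's view ends up carrying half the total weight rather than a third.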
Community benefits to immodesty
Modesty could be parasitic on a community level. If one is modest, one need never trouble oneself with any ‘object level’ considerations at all, and simply cultivate the appropriate weighting of consensuses to defer to. If everyone free-rode like that, no one would discover any new evidence, have any new ideas, and so collectively stagnate.[23] Progress only happens if people get their hands dirty on the object-level matters of the world, try to build models, and make some guesses—sometimes the experts have gotten it wrong, and one won’t ever find that out by deferring to them based on the fact they usually get it right.[24]
The distinction between ‘credence by my lights’ versus ‘credence all things considered’ allows the best of both worlds. One can say ‘by my lights, P’s credence is X’ yet at the same time ‘all things considered though, I take P’s credence to be Y’. One can form one’s own model of P, think the experts are wrong about P, and marshal evidence and arguments for why you are right and they are wrong; yet soberly realise that the chances are you are more likely mistaken; yet also think this effort is nonetheless valuable because even if one is most likely heading down a dead-end, the corporate efforts of people like you promise a good chance of someone finding a better path.
In macro, it’s important for people like me to always search for the truth, and reach conclusions about economic models in a way that is independent of the consensus model. In that way, I play my “worker ant” role of nudging the profession towards a greater truth. But at the same time we need to recognize that there is nothing special about our view. If we are made dictator, we should implement the consensus view of optimal policy, not our own. People have trouble with this, as it implies two levels of belief about what is true. The view from inside our mind, and the view from 20,000 miles out in space, where I see there is no objective reason to favor my view over Krugman’s.
Despite this example, maybe it is the case that ‘having a creative brain which makes big discoveries’ is anticorrelated to ‘having a sober brain well-calibrated to its limitations compared to others’: anecdotally, eccentric views among geniuses are common. Maybe for most it isn’t psychologically tenable to spend one’s life investigating a renegade view one thinks ultimately is likely a dead-end, and in fact people who do groundbreaking research generally have to be overconfident to do the best science. If so, we should act communally to moderate this cost, but not celebrate it as a feature.
Not everyone has to be working on discovering new information. One could imagine a symbiosis between eccentric overconfident geniuses whose epistemic comparative advantage is to gambol around idea-space to find new considerations, and well-calibrated thoughtful people whose comparative advantage is in soberly weighing considerations to arrive at a well-calibrated all-things-considered view.
Conclusion: a paean, and a plea
I have argued above for a strong approach to modesty, one which implies (at least in terms of one’s ‘all things considered view’) that one’s view of the object level merits counts for very little. Even if I am mistaken about the ideal strength of modesty, I am highly confident both the EA and rationalist communities err in the ‘insufficiently modest’ direction. I close on these remarks.
Rationalist/EA exceptionalism
Both communities endure a steady ostinato of complaints about arrogance. They’ve got a point. I despair of seeing some wannabe-iconoclast spout off about how obviously the solution to some famously recondite issue is X and the supposed experts who disagree obviously just need to better understand the ‘tenets of EA’ or the sequences. I become lachrymose when further discussion demonstrates said iconoclast has a shaky grasp of the basics, that they are recapitulating points already better-discussed in the literature, and so forth.[25]
To stress (and to pre-empt), the problem is not, “You aren’t kowtowing appropriately to social status!” The problem is considerable over-confidence married with inadequate understanding. This not only looks bad to outsiders,[26] but also is bad, as the individual (and the community itself) could get to the truth faster if they were more modest about their likely position in the distribution of knowledge about X, and then did commonsensical things to increase it.
Consider Gell-Mann amnesia (via Michael Crichton):
You open the newspaper to an article on some subject you know well. In Murray’s case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the “wet streets cause rain” stories. Paper’s full of them.
In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read as if the rest of the newspaper was somehow more accurate about Palestine than the baloney you just read. You turn the page, and forget what you know.
Gell-Mann cases invite inferring adverse judgements based on extrapolating from an instance of poor performance. When experts in multiple different subjects say the same thing (i.e. Murray and Crichton chatted to an expert on Palestine who had the same impression), this adverse inference gets all the stronger.
Some, perhaps many, pieces of work or corporate projects in our community share this property: a piece might look good or groundbreaking to us, being relatively less-informed, yet domain experts in the fields it touches upon tend to report it is misguided or rudimentary. Although it is possible to indict all these judgements, akin to a person who gives very adverse accounts of all of their previous romantic partners, we may start to wonder about a common factor explanation. Our collective ego is writing checks our epistemic performance (or, in candour, performance generally) cannot cash; general ignorance, rather than particular knowledge, may explain our self-regard.
To discover, not summarise
It is thought that to make the world go better new things need to be discovered, above and beyond making sound judgements on existing knowledge. Quickly making accurate determinations of the balance of reason for a given issue is greatly valuable for the latter, but not so much for the former.
Yet the two should not be confused. If one writes a short overview of a subject ‘for internal consumption’ which gives a fairly good impression of what a particular view should be, one should not be too worried if a specialist complains that you haven’t covered all the topics as adequately as one might. However, if one is aiming to write something which articulates an insight or understanding not just novel to the community, but novel to the world, one should be extremely concerned if domain experts review this work and say things along the lines of, “Well, this is sort of a potted recapitulation of work in our field, and this insight is widely discussed”.
Yet I see this happen a lot to things we tout as ‘breakthrough discoveries’. We want to avoid cases where we waste our time in unwitting recapitulation, or fail to catch elementary mistakes. Yet too often we license ourselves to pronounce these discoveries without sufficient modesty in cases where there’s already a large expert community working on similar matters. This does not preclude these discoveries, but it cautions us to carefully check first. On occasions where I take myself to have a new insight in areas outside my field (most often philosophy), I am extremely suspicious of my supposed discovery: all too often it would arise from my misunderstanding, or already be in the literature somewhere I haven’t looked. I carefully consult the literature as best as I can, and run the idea by true domain experts, to rule out these possibilities.[27]
Others seem to lack this modesty, and so predictably err. More generally, a more modest view of ‘intra-community versus outside competence’ may also avoid cases of having to reinvent the wheel (e.g. that scoring rule you spent six months deriving for a karma system is in this canonical paper), or for an effort to derail (e.g. oh drat, our evaluation provides worthless data because of reasons we could have known from googling ‘study design’).
Paradoxically pathological modesty
If the EA and rationalist communities comprised a bunch of highly overconfident and eccentric people buzzing around bumping their pet theories together, I may worry about overall judgement and how much novel work gets done, but I would grant this at least looks like fertile ground for new ideas to be developed.
Alas, not so much. What occurs instead is agreement approaching fawning obeisance to a small set of people the community anoints as ‘thought leaders’, and so centralizing on one particular eccentric and overconfident view.[28] So although we may preach immodesty on behalf of the wider community, our practice within it is much more deferential.
I hope a better understanding of modesty can get us out of this ‘worst of both worlds’ scenario. It can at least provide better ‘gurus’ to defer to. Better, modesty also helps to correct two mistaken impressions: one, the overly wide gap between our gurus and other experts; two, the overly narrow gap between ‘intelligent layperson in the community’ and ‘someone able to contribute to the state of the art’. Some topics are really hard: becoming someone with ‘something useful to say’ about these takes not days but years; there are many deep problems we must concern ourselves with; the few we select as champions, despite their virtue, cannot do them all alone; and we need all the outside help we can get.
Coda
What the EA community mainly has now is a briar-patch of dilettantes: each ranges widely, but with shallow roots, forming whorls around others where it deems it can find support. What it needs is a forest of experts: each spreading not so widely; forming a deeper foundation and gathering more resources from the common ground; standing apart yet taller, and in concert producing a verdant canopy.[29] I hope this transformation occurs, and aver modesty may help effect it.
Acknowledgements
I thank Joseph Carlsmith, Owen Cotton-Barratt, Eric Drexler, Ben Garfinkel, Roxanne Heston, Will MacAskill, Ben Pace, Stefan Schubert, Carl Shulman, and Pablo Stafforini for their helpful discussion, remarks, and criticism. Their kind help does not imply their agreement. The errors remain my own.
[Edit 30/10: Rewording and other corrections—thanks to Claire Zabel and Robert Wiblin]
[1] Much of this follows discussion in the social epistemology literature about conciliationism, or the ‘equal weight view’. See here for a summary
[2] They also argue at length about the appropriate weight each of these considerations should have on the scales of judgement. I suggest (although this is not necessary for this argument) that in many cases most of the action lies in judging the ‘power’ of evidence. In most cases I observe people agree that a given consideration C influences the credence one holds in P; they usually also agree in its qualitative direction; the challenge comes in trying to weigh each consideration against the others, to see which considerations one’s credence over P should pay the greatest attention to.
This may represent a general feature of webs of belief being dense and many-many (A given credence is influenced by many other considerations, and forms a consideration for many credences in turn), or it may simply be a particular feature of webs of belief in which humans perform poorly: although I am confident I can determine the sign of a particular consideration, I generally don’t back myself to hold credences (or likelihood ratios) to much greater precision than the first significant digit, and I (and, perhaps, others) struggle in cases where large numbers of considerations point in both directions.
[3] In the literature this is called ‘straight averaging’. For a variety of technical reasons this doesn’t quite work as a peer update rule. That said, given things like Bayesian aggregation remain somewhat open problems, I hope readers will accept my promissory note that there will be a more precise account which will produce effectively the same results (maybe ‘approximately splitting the difference’) through the same motivation.
[4] C.f. Aumann’s agreement theorem. As an aside (which I owe to Carl Shulman), straight averaging will not work in some degenerate cases where (similar to ‘common knowledge puzzles’) one can infer precise observations from the probabilities stated. The neatest example I can find comes from Hal Finney (see also):
Suppose two coins are flipped out of sight, and you and another person are trying to estimate the probability that both are heads. You are told what the first coin is, and the other person is told what the second coin is. You both report your observations to each other.
Let’s suppose that they did in fact fall both heads. You are told that the first coin is heads, and you report the probability of both heads as 1⁄2. The other person is told that the second coin is heads, and he also reports the probability as 1⁄2. However, you can now both conclude that the probability is 1, because if either of you had been told that the coin was tails, he would have reported a probability of zero. So in this case, both of you update your information away from the estimate provided by the other.
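The two-coin case above is small enough to enumerate exhaustively. The following sketch does so with exact fractions; the helper name `prob_both_heads` and the encoding of outcomes as pairs of letters are illustrative assumptions rather than anything from the quoted example.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes, as (first coin, second coin).
OUTCOMES = list(product("HT", repeat=2))


def prob_both_heads(consistent):
    """P(both heads) over the outcomes that pass the `consistent` predicate."""
    possible = [o for o in OUTCOMES if consistent(o)]
    hits = [o for o in possible if o == ("H", "H")]
    return Fraction(len(hits), len(possible))


# You see only the first coin (heads); the other person sees only the second (heads).
you = prob_both_heads(lambda o: o[0] == "H")
other = prob_both_heads(lambda o: o[1] == "H")

# Someone who had seen tails would have reported zero instead.
tails_report = prob_both_heads(lambda o: o[0] == "T")

# Straight averaging of the two stated probabilities.
averaged = (you + other) / 2

# A report of 1/2 rather than 0 reveals that the reporter saw heads, so each
# party can condition on both observations at once.
pooled = prob_both_heads(lambda o: o[0] == "H" and o[1] == "H")

print(you, other, tails_report, averaged, pooled)
```

Averaging the two reports leaves the estimate at one half, whereas conditioning on what each report reveals gives certainty, which is the sense in which straight averaging fails here.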
[5] To motivate: Adam and Beatrice no longer know whether or not reasons they hold for or against P are private evidence or not. Yet (given epistemic peerhood), they have no principled reason to suppose “I know something that they don’t” is more plausible than the opposite. So again they should be symmetrical.
[6] (On which more later) it is worth making clear that the possibility of bias for either Adam or Beatrice doesn’t change the winning strategy on expectation. Say Adam’s credence for P is in fact biased upwards by 0.4. If Adam knows this, he can adjust and become unbiased; if Oliver or Beatrice knows this (and knows Adam doesn’t), they break the peerhood with Adam but can simulate an unbiased Adam* who would remain a peer, and act accordingly. If none of them know this, then it is the case that Beatrice wins, as does Oliver following a non-averaging ‘go with Beatrice’ strategy. Yet this is simply epistemic luck: without information, all reasonable prior distribution candidates of (Adam’s bias − Beatrice’s bias) are symmetrical about 0.
[7] Another benefit of modesty is speed: Although it is the case Adam and Beatrice’s credence (and thus the average) gets more accurate if they have time to discuss it, and so catch one another if they make a mistake or reveal previously-private evidence, averaging is faster and the trade-off in time for better precision may not be worth it. It still remains the case, as per the first example, that they still do better, after this discussion, if they meet in the middle on residual disagreement.
[8] A further (albeit minor and technical) dividend is that although individual guesses may form any distribution (for which the standard deviation may not be a helpful summary), the central limit theorem applies to the average of guesses distribution, so it tends to normality.
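A small simulation can illustrate this point. The sketch below assumes, purely for illustration, that individual guesses are independent draws from a skewed (exponential) distribution, and compares the skewness of single guesses with the skewness of averages over 50 guessers. The distribution, the sample sizes, and the function names are all assumptions, and skewness is only one symptom of non-normality.

```python
import random
from statistics import fmean


def skewness(xs):
    """Sample skewness: third central moment over variance to the power 1.5."""
    m = fmean(xs)
    m2 = fmean([(x - m) ** 2 for x in xs])
    m3 = fmean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def simulate(n_guessers=50, n_trials=2000, seed=0):
    """Return (skewness of individual guesses, skewness of averaged guesses)."""
    rng = random.Random(seed)
    # Hypothetical individual guesses: strongly right-skewed.
    individual = [rng.expovariate(1.0) for _ in range(n_trials)]
    # Each trial averages the guesses of n_guessers independent guessers.
    averages = [
        fmean([rng.expovariate(1.0) for _ in range(n_guessers)])
        for _ in range(n_trials)
    ]
    return skewness(individual), skewness(averages)


if __name__ == "__main__":
    skew_individual, skew_average = simulate()
    print(f"skewness of individual guesses: {skew_individual:.2f}")
    print(f"skewness of averaged guesses:   {skew_average:.2f}")
```

The averaged guesses should show much less skew than the individual ones. The central limit theorem itself requires independent guesses with finite variance, which is the same independence caveat the essay raises about common consent.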
[9] Even if one is the world authority, there should be some deference to lesser experts. In cases where the world expert is an outlier, one needs to weigh up numbers versus (relative) epistemic superiority to find the appropriate middle.
[10] God from the Mount of Sinai, whose gray top Shall tremble, he descending, will himself In Thunder Lightning and loud Trumpets sound Ordaine them Lawes…
Milton, Paradise Lost
[11] I take the general pattern that strong modesty usually immures one from common biases to be a further point in its favour.
[13] A related philosophical defence would point out that the self-undermining objection would only apply to whether one should believe modesty, not whether modesty is in fact true.
[14] I naturally get much more sceptical if that person then generalises from this N=1 uncontrolled unblinded crossover trial to others, or takes it as lending significant support against some particular expert consensus or expertise more broadly: “Doctors don’t know anything about back pain! They did all this rubbish but I found out all anyone needs to do is cut carbs!”
[15] It also provokes fear and trembling in my pre-paradigmatic day job, given I don’t want the area to have strong founder effects which poorly track the truth.
[16] One of the easiest hard questions, as millennia-old philosophical dilemmas go. Though this impossible question is fully and completely dissolved on Less Wrong, aspiring reductionists should try to solve it on their own.
[17] Aside: A related consideration is ‘optimal damping’ of credences, which is closely related to resilience. Very volatile credences may represent the buffeting of a degree of belief by evidence large relative to one’s prior, but they may also represent poor calibration in overweighing new evidence (and vice versa). The ‘ideal’ response in terms of accuracy is given by standard theory. Yet it is also worth noting that one’s prudential reasons may warrant introducing further lag or lead, akin to the ‘D’ or ‘I’ components of a PID controller. In large irreversible decisions (e.g. career choice) it may be better to wait a while after one’s credences support a change before changing action; for cases of new moral considerations it may be better to act ‘in advance’ for precautionary principle-esque reasons.
[18] (Owed to Will MacAskill) There’s also a selection effect: of a sample of ‘accurate contrarians’, many of these may be lucky rather than good.
[19] I owe this particular example to Eric Drexler, but similar counter-examples along these lines to Carl Shulman.
[20] Another general worry is these difficult-to-divine considerations offer plenty of fudge factors—both to make modesty get the ‘right answer’ in historical cases, and to fudge present areas of uncertainty to get results that accord with one’s prior judgement.
[21] I owe both this modification and example to discussions with Eric Drexler. There are some costs—one may think there are cases one should defer to an outside view on the web of belief (E.g. Christian apologist: “Sure, I agree with scientific consensus that it’s improbable Jesus rose naturally from the dead, but the key argument is whether Jesus rose supernaturally from the dead. So the consensus for philosophers of religion is the right expert class.”) The balance of merit overall is hard to say, but such a modification still looks like pretty strong modesty.
[22] In conversation I recall a suggestion by Shulman that such a credence should change one’s behaviour regarding EA: maybe one should do theology research in the hope of finding a way to extract infinite value etc. Yet the expert class for action|Theism gives a highly adverse prior: virtually no actual theists (regardless of theological expertise, within or outside EA) advocate this.
[23] I understand a similar point is raised in economics regarding the EMH and the success of index funds. Someone has to do the price discovery.
[25] For obvious reasons I’m reluctant to cite specific examples. I can offer some key words for the sort of topics I see this problem as endemic: Many-worlds, population ethics, free will, p-zombies, macroeconomics, meta-ethics.
[26] C.f. Augustine, On the Literal Meaning of Genesis:
Usually, even a non-Christian knows something about the earth, the heavens, and the other elements of this world, about the motion and orbit of the stars and even their size and relative positions, about the predictable eclipses of the sun and moon, the cycles of the years and the seasons, about the kinds of animals, shrubs, stones, and so forth, and this knowledge he holds to as being certain from reason and experience. Now, it is a disgraceful and dangerous thing for an infidel to hear a Christian, presumably giving the meaning of Holy Scripture, talking nonsense on these topics; and we should take all means to prevent such an embarrassing situation, in which people show up vast ignorance in a Christian and laugh it to scorn.
[27] I’m uncommonly fortunate that for me such domain experts are both nearby and generous with their attention. Yet this obstacle is not insurmountable. An idea (which I owe to Pablo Stafforini) is that a contrarian and a sceptic of the contrarian view could bet on whether a given expert, on exposure to the contrarian view, would change their mind as the contrarian predicts. S may bet with C: “We’ll pay some expert $X to read your work explicating your view. If they change their mind significantly in favour (however we cash this out), I’ll pay the $X; if not, you pay the $X.”
[29] Perhaps unsurprisingly, I would use a more modest ecological metaphor in my own case. In reclaiming extremely inhospitable environments, the initial pioneer organisms die rapidly. Yet their corpses sustain detritivores, and little by little, an initial ecosystem emerges to be succeeded by others. In a similar way, I hope that the detritus I provide will, after a fashion (and a while), become the compost in which an oak tree grows.
In defence of epistemic modesty
This piece defends a strong form of epistemic modesty: that, in most cases, one should pay scarcely any attention to what you find the most persuasive view on an issue, hewing instead to an idealized consensus of experts. I start by better pinning down exactly what is meant by ‘epistemic modesty’, go on to offer a variety of reasons that motivate it, and reply to some common objections. Along the way, I show common traps people being inappropriately modest fall into. I conclude that modesty is a superior epistemic strategy, and ought to be more widely used—particularly in the EA/rationalist communities.
Provocation
I argue for this:
In virtually all cases, the credence you hold for any given belief should be dominated by the balance of credences held by your epistemic peers and superiors. One’s own convictions should weigh no more heavily in the balance than that of one other epistemic peer.
Introductions and clarifications
A favourable motivating case
Suppose your mother thinks she can make some easy money day trading blue-chip stocks, and plans to kick off tomorrow shorting Google on the stock market, as she’s sure it’s headed for a crash. You might want to dissuade her in a variety of ways.
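You might appeal to an outside view:
Mum, when you make this short you’re going to be betting against some hedge fund, quant, or whatever else. They have loads of advantages: relevant background, better information, lots of data and computers, and so on. Do you really think you’re odds on to win this bet?
Or appeal to some reference class:
Mum, I’m pretty sure the research says that people trying to day-trade stocks tend not to make much money at all. Although you might hear some big successes on the internet, you don’t hear about everyone else who went bust. So why should you think you are likely to be one of these remarkable successes?
Or just cite disagreement:
Look Mum: Dad, sister, the grandparents and I all think this is a really bad idea. Please don’t do it!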
Instead of directly challenging the object level claim (i.e. “Google isn’t overvalued, because X”), these considerations attempt to situate the cogniser within some population, and from characteristics of this population infer the likelihood of this cogniser getting things right.
Call the practice of using these considerations epistemic modesty. We can distinguish two components:
‘In theory’ modesty: That considerations of this type should in principle influence our credences.
‘In practice’ modesty: That one should in fact use these considerations when forming credences.
Weaker and stronger forms of modesty
Some degree of modesty is (almost) inarguable. If one leaves for work on Tuesday and finds all one’s neighbours have left their bins out, that’s at least reason to doubt one’s belief that bins are on Thursday, and perhaps sufficient to believe instead that bins are on Tuesday (and follow suit with one’s own bins). If it appears that, say, the coagulation cascade ‘couldn’t evolve’, the near unanimity of assent for evolution among biologists at least counts against this, if it is not a decisive reason to believe, despite one’s impressions, that it could. Nick Beckstead suggests something like ‘elite common sense’ forms a prior which one should be hesitant to diverge from without good reason.
I argue for something much stronger (c.f. the Provocation above): in theory, one’s credence in some proposition P should be almost wholly informed by modest considerations. That is, ceteris paribus, the fact it appears to you that P should weigh no more heavily in one’s determination regarding P than knowing that it appears to someone else that P. Not only is this the case in theory, but it is also the case in practice. One’s all things considered judgement on P should be just that implied by an idealized expert consensus on P, no matter one’s own convictions regarding P.
Motivations for more modesty
Why believe ‘strong form’ epistemic modesty? I first show families of cases where ‘strong modesty’ leads to predictably better performance, and show these results generalise widely.[1]
The symmetry case
Suppose Adam and Beatrice are perfect epistemic peers, equal in all respects which could bear on them forming more or less accurate beliefs. They disagree on a particular proposition P (say “This tree is an Oak tree”). They argue about this at length, such that all considerations Adam takes to favour “This is an Oak tree” are known to Beatrice, and vice versa.[2] After this, they still disagree: Adam has a credence of 0.8, Beatrice 0.4.
Suppose an outside party (call him Oliver) is asked for his credence of P, given Adam and Beatrice’s credences and their epistemic peer-hood to one another, but bereft of any object-level knowledge. He should split the difference between Adam and Beatrice, giving 0.6: Oliver doesn’t have any reason to favour Adam over Beatrice’s credence for P as they are epistemic peers, and so splitting the difference gives the least expected error.[3] If he were faced with a large class of similar situations (maybe Adam and Beatrice get into the same argument for Tree 2 to Tree 10,000) Oliver would find that difference splitting has lower error than biasing to either Adam or Beatrice’s credence.
Adam and Beatrice should do likewise. They also know they are epistemic peers, and so they should also know that for whatever considerations explain their difference (perhaps Adam is really persuaded by the leaf shapes, but Beatrice isn’t) Adam’s take and Beatrice’s take are no more likely to be right than one another. So Adam should go (and Beatrice vice-versa), “I don’t understand why Beatrice isn’t persuaded by the leaf shapes, but she expresses the same about why I find it so convincing. Given she is my epistemic peer, ‘She’s not getting it’, and, ‘I’m not getting it’ are equally likely. So we should meet in the middle”.
The underlying intuition is one of symmetry. Adam and Beatrice have the same information. The correct credence regarding P given this information should not depend on which brain, Adam’s or Beatrice’s, one happens to inhabit. Given this, they should hold the same credence[4], and as Adam is as likely as Beatrice to be the one further from the truth, the shared credence should be in the middle.
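The claim that difference splitting has lower expected error can be checked with a little arithmetic. The sketch below is an illustration only: it assumes squared error (the Brier score) as the measure of accuracy, which the text does not specify, and the function names are hypothetical. It compares the 0.6 midpoint with the average penalty incurred by Adam's 0.8 and Beatrice's 0.4 under each possible outcome.

```python
def brier(credence: float, outcome: int) -> float:
    """Squared error of a credence against an outcome of 1 (P true) or 0 (P false)."""
    return (credence - outcome) ** 2


def compare(adam: float, beatrice: float) -> dict:
    """For each outcome, return (error of the midpoint, mean error of the two peers)."""
    midpoint = (adam + beatrice) / 2
    results = {}
    for outcome in (0, 1):
        peer_mean = (brier(adam, outcome) + brier(beatrice, outcome)) / 2
        results[outcome] = (brier(midpoint, outcome), peer_mean)
    return results


if __name__ == "__main__":
    for outcome, (mid_err, peer_err) in compare(0.8, 0.4).items():
        print(f"P={'true' if outcome else 'false'}: midpoint {mid_err:.2f}, "
              f"average peer {peer_err:.2f}")
```

Whichever way P turns out, the midpoint's penalty is 0.04 lower than the average of the two peers' penalties, a gap equal to the square of half the distance between them. This does not say the midpoint beats whichever peer happened to be right; it says that an outsider who cannot tell which peer that is does better, in expectation, than by picking one of them.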
Compressed sensing of (and not double-counting) the object level
It seems odd that both Adam and Beatrice do better discarding their object level considerations regarding P. If we adjust the scenario above so they cannot discuss with one another but are merely informed of each other’s credences (and that they are peers regarding P), the right strategy remains to meet in the middle.[5] Yet how come Adam and Beatrice are doing better if they ignore relevant information? Both Adam and Beatrice have their ‘inside view’ evidence (i.e. what they take to bear on the credence of P) and the ‘outside view’ evidence (what each other think about P). Why not use a hybrid strategy which uses both?
Yet to whatever extent Adam or Beatrice’s hybrid approach leads them to diverge from equal weight, they will do worse. Oliver can use the ‘meet in the middle’ strategy to get expectedly better accuracy than either of them gets by biasing towards their own inside view determination. In betting terms, Oliver can arbitrage any difference in credence between Adam and Beatrice.
We can explain why: the credences Adam and Beatrice offer can be thought of as very compressed summaries of the considerations they take to bear upon P. Whatever ‘inside view’ considerations Adam took to bear upon P are already ‘priced in’ to the credence he reports (ditto Beatrice). Modesty is not ignoring this evidence, but weighing it appropriately: if Adam then tries to adjust the outside view determination by his own take on the balance of evidence, he double counts his inside view: once in itself, and once more by including his credence as weighing equally to Beatrice’s in giving the outside view.
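The double counting can be made explicit with a small calculation. The following is a minimal sketch, assuming a hypothetical `pull` parameter for how far Adam moves back from the midpoint towards his own view; neither the parameter nor the function names come from the argument above.

```python
def hybrid_credence(own: float, peer: float, pull: float) -> float:
    """Start from the equal-weight midpoint, then move a fraction `pull`
    of the way back towards one's own credence (0 = modest, 1 = steadfast)."""
    midpoint = (own + peer) / 2
    return (1 - pull) * midpoint + pull * own


def effective_own_weight(pull: float) -> float:
    """Weight one's own credence carries in the final mixture:
    0.5 from the midpoint, plus half of whatever is pulled back."""
    return 0.5 + 0.5 * pull


if __name__ == "__main__":
    adam, beatrice = 0.8, 0.4
    for pull in (0.0, 0.5, 1.0):
        print(pull, round(hybrid_credence(adam, beatrice, pull), 3),
              effective_own_weight(pull))
```

With Adam at 0.8, Beatrice at 0.4, and a pull of 0.5, Adam ends at 0.7: his own credence is in effect weighted 0.75 against Beatrice's 0.25, even though by hypothesis they are peers.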
One’s take on the set of considerations regarding P may err, either by bias,[6] ignorance, or ‘innocent’ mistake. Splitting the difference between you and your peer’s very high level summary of these captures the great fraction of benefit of hashing out where these summaries differ.[7] Modesty correctly diagnoses that one’s high level summary is no more likely to be more accurate than one’s peers, and so holds those in equal regard, even in cases where the components of one’s own summary are known better.
Repeated measures, brains as credence sensors, and the wisdom of crowds
Modesty outperforms non-modesty in the n=2 case. The degree of outperformance grows (albeit concavely) as n increases.
Scientific fields often have to deal with unreliable measurement. They commonly mitigate this by having repeat measurement. If you have a crummy thermometer, repeating readings several times improves accuracy over just the once. Human brains also try and measure things, and they are also often unreliable. It is commonly observed that nonetheless the average of their measurement tends to lie closer to the mark than the vast majority of individual measurements. Consider the commonplace ‘guess how many skittles are in this jar’ or similar estimation games: the usual observation is that the average of all the guesses is better than all (or almost all) the individual guesses.
A toy model makes this unsurprising. The individual guesses will form some distribution centered on the true value. Thus the expected error of a given individual guess is the standard deviation of this distribution. The expected error of the average of all guesses is given by the standard error, which is the standard deviation divided by root(number of guesses):[8] with 10 individuals, the error is about 3 times smaller than the expected error of each individual guess; with 100, 10 times smaller; and so on.
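The toy model is easy to simulate. The sketch below makes assumptions for illustration only: guesses are independent and normally distributed around the true value, the true value (500) and spread (100) are arbitrary, and ‘error’ is measured as root-mean-square error. The function name is hypothetical.

```python
import random
import statistics


def rms_errors(n_guessers, true_value=500.0, sd=100.0, trials=2000, seed=0):
    """Return (RMS error of one individual guess, RMS error of the crowd average)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    individual_sq, crowd_sq = [], []
    for _ in range(trials):
        guesses = [rng.gauss(true_value, sd) for _ in range(n_guessers)]
        individual_sq.append((guesses[0] - true_value) ** 2)
        crowd_sq.append((statistics.fmean(guesses) - true_value) ** 2)
    return statistics.fmean(individual_sq) ** 0.5, statistics.fmean(crowd_sq) ** 0.5


if __name__ == "__main__":
    for n in (10, 100):
        individual, crowd = rms_errors(n)
        print(f"n={n}: individual {individual:.1f}, crowd {crowd:.1f}, "
              f"ratio {individual / crowd:.1f}")
```

Under these assumptions the ratio should come out close to root(10), roughly 3.2, for ten guessers and close to 10 for a hundred. If guesses shared a common bias or were correlated, the improvement would be smaller, which is why the independence assumption matters.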
Analogously, human brains also try to measure credences or degrees of belief, and are similarly imperfect to when they’re trying to estimate ‘number of X’. Yet one may expect a similar effect to this ‘wisdom of crowds’ to operate here too. In the same way Adam and Beatrice would do better in the situation above if they took the average (even if it went against their view of the balance of reasons by their lights), if Adam-to-Zabaleta (all epistemic peers) investigated the same P, they’d expect to do better if they took the average of their group versus steadfastly holding to the credence they arrived at ‘by their lights’. Whatever inaccuracies that may throw off their individual estimates of P somewhat cancel out.
Deferring to better brains
The arguments above apply to cases where one is an epistemic peer. If not, one needs to adjust by some measure of ‘epistemic virtue’. In cases where Adam is an epistemic superior to Beatrice, they should meet closer to Adam’s view, commensurate with the degree of epistemic superiority (and vice versa).
Although reasons for being an epistemic superior could be ‘they’re a superforecaster’ or ‘they’re smarter than I am’, perhaps the most common source of epistemic superiors lie under the heading of ‘subject matter expert’. On topics from human nutrition, to voting rules, to the impact of the minimum wage, to the nature of consciousness, to basically anything that isn’t trivial, one can usually find a fairly large group of very smart people who spend many years studying that topic, who make public their views about this topic (sometimes not even behind a paywall). That they at least have a much greater body of relevant information and have spent longer thinking about it gives them a large advantage compared to you.
In such cases, the analogy might be that your brain is a sundial, whilst theirs is an atomic clock. So if you have the option of taking their readings rather than yours, you should do so. The evidence a reading of a sundial provides about the time conditional on the atomic clock reading is effectively zero. ‘Splitting the difference’ in analogous epistemic cases should result in both you and your epistemic superior agreeing that they are right and you are wrong.
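One way to make ‘commensurate with the degree of epistemic superiority’ concrete is inverse-variance weighting, a standard way to combine independent, unbiased measurements of differing reliability. This is a swapped-in technique for illustration, not a rule the argument itself specifies; the function name and the error figures for the two clocks are made up.

```python
def precision_weighted(readings):
    """Combine (value, standard_deviation) pairs, weighting each by 1 / sd**2.

    Returns the combined estimate and each reading's share of the total weight.
    Assumes the readings' errors are independent and unbiased.
    """
    weights = [1.0 / sd ** 2 for _, sd in readings]
    total = sum(weights)
    estimate = sum(w * value for w, (value, _) in zip(weights, readings)) / total
    shares = [w / total for w in weights]
    return estimate, shares


if __name__ == "__main__":
    # Two peers with equal reliability: the estimate is the midpoint.
    print(precision_weighted([(0.8, 0.1), (0.4, 0.1)]))
    # Minutes past noon: a sundial (error ~15 min) against an atomic clock
    # (error set to 0.001 min here purely for illustration).
    print(precision_weighted([(30.0, 15.0), (12.0, 0.001)]))
```

With equal reliability the rule reduces to splitting the difference. In the clock case the sundial's share of the weight is a few parts in a billion, so the combined estimate is, for practical purposes, just the atomic clock's reading.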
Inference to the ideal epistemic observer
We can summarise these motivations by analogy to ideal observers (used elsewhere in perception and ethical theory). We can gesture that an ideal (epistemic) observer is just that which is able to form the most accurate credence for P given whatever prior: we can explain they have vast intelligence, full knowledge of all matters that bear upon P, perfect judgement, and in essence all epistemic virtues in excelsis.
Now consider this helpful fiction:
The epistemic fall: Imagine a population solely comprised of ideal observers, who all share the same (correct) view on P. Overnight their epistemic virtues are assailed: they lose some of their reasoning capacity; they pick up particular biases that could throw them one way or another; they lose information, and so on, and each one to varying degrees.
They wake up to find they now have all sorts of different credences about P, and none of them can remember what credence they all held yesterday. What should they do?
It seems our fallen ideal observers can begin to piece together what their original credence was about P by finding out more about their credences and remaining epistemic virtue, and so backpropagate their return to epistemic apotheosis. If they find they’re all similarly virtuous and are evenly scattered, their best guess is the ideal observer was in the middle of the distribution (c.f. the wisdom of crowds). If they see a trend that those with greater residual virtue tend to hold a higher credence in P, they should attempt to extrapolate this trend to suggest the ideal agent origin from which they were differentially blown off course. If they see one group demonstrates a bias that others do not, they can correct the position of this group before trying these procedures. If they find the more virtuous agents are more scattered regarding P, (or that they segregate into widely dispersed aggregations), this should make them very unsure about where the ideal observer initially was. And so on.
Such a model clarifies the benefit of modesty. Although we didn’t have some grand epistemic fall, it is clear we all fall manifestly short of an ideal observer. Yet we all fall short in different respects, and in different degrees. One should want to believe whatever one would believe if one was an ideal observer, shorn of one’s manifest epistemic vices. Purely immodest views must say their best guess is the ideal observer would think the same as they do, and hope that all the vicissitudes of their epistemic vice happen to cancel out. By accounting for the distribution of cognisers, modesty allows a much better forecast, and so a much more accurate belief. And the best such forecast is the strong form of modesty, where one’s particular datapoint, in and of itself, should not be counted higher than any other.
Excursus: Against common justifications for immodesty
So much for strong modesty in theory. How does it perform in practice?
One rough heuristic for strong modesty is this: for any question, find the plausible expert class to answer that question (e.g. if P is whether to raise the minimum wage, talk to economists). If this class converges on a particular answer, believe that answer too. If they do not agree, have little confidence in any answer. Do this no matter one’s impression of the object level considerations, even where these recommend (by your lights) a particular answer.
Such a model captures all the common sense cases of modesty—trust the results in typical textbooks, defer to consensus in cases like when to put the bins out, and so on. I now show it is also better in many cases where people think it is better to be immodest.
Being ‘well informed’ (or even true expertise) is not enough
A common refrain is that one is entitled to ‘join issue’ with the experts due to one having made some non-trivial effort at improving one’s knowledge of the subject. “Sure, I accept experts widely disagree on macro-economics, but I’m confident in neo-Keynesianism after many months of careful study and reflection.”
This doesn’t fly by the symmetry argument above. Our outsider observes widespread disagreement in the area of macroeconomics, and that many experts who spend years on the subject nonetheless greatly disagree. Although it is possible the ideal observer would have been in one or another of the ‘camps’ (the clustering implies intermediate positions are less plausible), the outsider cannot adjudicate which one if we grant the economists in each appear to have similar levels of epistemic virtue. The balance of this outside view changes imperceptibly if another person who despite a few months of study remains nowhere near peerhood (let alone superiority) of these divided experts, happens to side with one camp or another. By symmetry, one’s own view of the balance of reason should remain unchanged if this ‘another person’ happened to be you.
The same applies even if you are a bona fide expert. Unless the distribution of expertise is such that there is a lone ‘world authority’ above all others (and you’re them) your fellow experts form your epistemic peer group. Taking the outside view is still the better bet: the consensus of experts tends to be right more often than dissenting experts, and so some difference splitting (weighed more to the consensus owing to their greater numbers) is the right answer.[9]
Common knowledge ‘silver bullet arguments’
Suppose one takes an introductory class in economics. From this, one sees there must be a ‘knock-down’ argument against a minimum wage:
Yet one quickly discovers economists seem to be deeply divided over the merits of the minimum wage (as they are about most other things). See for example this poll suggesting 38 economic experts in the US are pretty evenly divided on whether the minimum wage would ‘hit’ employment for low-skill workers, and leant in favour of the minimum wage ‘all things considered’.
It seems risible to suppose these economists don’t know their economics 101. What seems much more likely is that they know other things that you don’t which make the minimum wage more reasonable than your jejune understanding of the subject suggests. One need not belabour which side the outside view strongly prefers.
Yet it is depressingly common for people to confidently hold that view X or Y is decisively refuted by some point or another, notwithstanding the fact this point is well known to the group of experts that nonetheless hold X or Y. Of course in some cases one really has touched on the decisive point the experts have failed to appreciate. More often, one is proclaiming that one is on the wrong side of the Dunning-Kruger effect.
Debunking the expert class (but not you)
To the litany of cases where (apparent) experts screwed up, we can add verses without end. So we might be inclined to debunk a particular ‘expert consensus’ due to some bias or irrationality we can identify. Thus, having seen there are no ‘real’ experts to help us, we must look at the object level case.
The key question is this: “How are you better?” And it is here that debunking attempts often founder:
An undercutting defeater for one aspect of epistemic superiority for the expert class is not good enough. Maybe one can show the expert class has a poor predictive track record in their field. Unless one has a better track record in their field, this puts you on a par with respect to this desideratum of epistemic virtue. They likely have others (e.g. more relevant object-level knowledge) that should still give them an edge, albeit attenuated.
An undercutting defeater that seems to apply equally well to oneself as the expert class also isn’t enough. Suppose (say) economics is riven by ideological bias: why are you less susceptible to these biases? The same ideological biases that might plague professional economists may also plague amateur economists, but the former retain other advantages.
Even if a proposed debunking is ‘selectively toxic’ to the experts versus you, they still might be your epistemic superior all things considered. Both Big Pharma and Professional Philosophy may be misaligned, but perhaps not so much as to be orthogonal or antiparallel to the truth: in both they still expectedly benefit by finding drugs that work or making good arguments respectively. They may still fare better overall than the “intelligent layperson who’s read extensively”, even if the latter is not subject to ‘publish or perish’ or similar.
Even if a proposed debunking shows one as decisively superior to that expert class, there may be another expert class which remains epistemically superior to you. Maybe you can persuasively show professional philosophers are so compromised on consciousness that they should not be deferred to about it. Then the real expert class may simply switch to something like ‘intelligent people outside the academy who think a lot about the topic’. If it’s the case that this group of people do not share your confidence in your view, it seems outsiders should still reject it—as should you.
It need not be said that the track record for these debunking defeaters is poor. Most crackpots have a persecution narrative to explain why the mainstream doesn’t recognise or understand them, and some of the most mordant criticisms of the medical establishment arise from those touting complementary medicine. Thus ‘explaining away’ expert disagreement may not put one in a more propitious reference class than one started from. One should be particularly suspicious of debunking(s) sufficiently general that the person holding the unorthodox view has no epistemic peers—they are akin to Moses, descending from Mt. Sinai, bringing down God-breathed truth for the rest of us.[10]
Private evidence and pet arguments
Suppose one thinks one is in receipt of a powerful piece of private evidence: maybe you’ve got new data or a new insight. So even though the experts are generally in the right, in this particular case they are wrong because they are unaware of this new consideration.
New knowledge will not spread instantaneously, and that someone can be ‘ahead of the curve’ comes as no surprise. Yet many people who take themselves to have private evidence are wrong: maybe experts know about it but don’t bother to discuss it because it is so weak, or it is already in the literature (but you haven’t seen it), or it isn’t actually relevant to the topic, or whatever else. Most mavericks who take themselves to have new evidence that overturns consensus are mistaken.
The natural risk is people tend to be too partial to their pet arguments or pet data, and so give them undue weight, and so one’s ‘insider’ perceptions should perhaps be attenuated by this fact. I suspect most are overconfident here.[11] If this private evidence really is powerful, one should expect it to be persuasive to members of this expert class once they become aware of it. So it seems the credence one should have is the (appropriately discounted) forecast of what the expert class would think once you provide them this evidence.
The natural test of the power of this private evidence is to make it public. If one observes experts (or just epistemic peers) shift to your view, you were right about how powerful this evidence was. If instead one sees a much more modest change in opinion, this should lead one to downgrade your estimate as to how powerful this evidence really is (and perhaps provide calibration data for next time). Holding instead this really is decisive evidence leads one to the problematic ‘common knowledge silver bullet’ case discussed above. Inferring from this experts just can’t understand your reasoning or are biased against outsiders or whatever else produces a suspiciously self-serving debunking argument, also discussed above.
Objections
So much for the case in favour. What about the case against? I divide objections into those ‘in theory’, and those ‘in practice’.
In theory
There’s no pure ‘outside view’[12]
It is not the case you can bootstrap an outside view from nothing. One needs to at least start with some considerations as to what makes one an epistemic peer or superior, and probably some minimal background knowledge of ‘aboutness’ to place topics under one or another expert class.
In the same way large amounts of our empirical information are now derived by instrument rather than direct application of our senses (but were ultimately germinated from direct sensory experience), large amounts of our epistemic information can be derived by deferring to better (or more) brains rather than using our own, even if this relies on some initial seed epistemology we have to realise for ourselves. This ‘germinal set of claims’ can still be modestly revised later.
Immodestly modest?
One line of attack from the social epistemology literature is that strong forms of modesty are self-defeating. If one is modest, one should assumedly be modest about ‘What is the right way to form beliefs if epistemic peers disagree with you?’ Yet one finds that very few people endorse the sort of epistemic modesty advocated above. When one looks among potential expert classes, such as more intelligent friends of mine (i.e. friends of mine), epistemologists, and so on, conciliatory views like these command only a minority. So the epistemically modest should vanish as they defer to the more steadfast consensus.
If so, so much the worse for modesty. I offer a couple of incomplete defences:
One is haggling over the topic of disagreement. In my limited reading of ‘equal weight/conciliatory views and their detractors’, I take the detractors to be suggesting something like “one is ‘within one’s rights’ to be steadfast”, rather than something like “you’re more accurate if you’re steadfast”. Maybe there are epistemic virtues which aren’t the same as being more accurate. Yet there may be less disagreement on ‘conditional on an accuracy first view, is modesty the right approach?’
This only gets so far (after all, shouldn’t we be modest about whether only to care about accuracy?). A more general defence is this: the ‘what if you apply the theory to itself?’ problem looks pretty pervasive across theories.[13] Accounts of moral uncertainty that in whatever sense involve weighing normative theories by their plausibility tend to run into problems if the same accounts are applied ‘one level up’ to meta-moral uncertainty. Bayesian accounts of epistemology seem to go haywire if we think one should have a credence in Bayesian epistemology itself, especially if one assigns any non-zero credence on any theory which entails object level credences have undefined values.
Closer to home, milder versions of conciliation (e.g. “Pay some attention to peer disagreement, but it’s not the only factor”) share a similarly troublesome recursive loop (“Well, I see most other people are steadfast, so I should update to be a bit less conciliatory, but now I have to apply my modified view to this disagreement again”) and neat convergence is not guaranteed. The theories which avoid this problem (e.g. ‘Wholly steadfast, so peer disagreement should be ignored’), tend to be the least plausible on the object level (e.g. That if you believe bins are on Thursday, the fact all your neighbours have their bins out on Tuesday is not even reason to reconsider your belief).
A solution to these types of problems remains elusive. Yet modesty finds itself in fairly good company. It may be the case that a good resolution to this type of issue would rule out the strong form of modesty advocated here, in favour of some intermediate view. Until then, I hope the (admittedly inelegant) “Be modest, save for meta-epistemic norms about modesty itself” is not too great a cost to bear across the scales from the merits of the approach.
In practice
I take most of the action to surround whether modesty makes sense as a practical procedure in the real world, even granting its ‘in theory’ virtue. Given the strength of the modesty I advocate, the fact that we use something like it in some cases, and can identify others where it would help, is not enough. It needs to be shown as a better strategy than even slightly weaker forms, in circumstances deliberately selected to pose the greatest challenge to strong modesty.
Trivial (and less trivial) non-use cases
For some topics there are no relevant epistemic peers or superiors to consider. This is commonly the case with pretty trivial beliefs (e.g. my desk is yellow).
Modesty also doesn’t help much for individual tastes, idiosyncrasies, or circumstances. If Adam works best listening to Bach and Beatrice to Beethoven, they probably won’t do better ‘meeting in the middle’ and both going half-and-half for each (or maybe picking a composer intermediate in history, like Mozart). Anyway, Adam is probably Beatrice’s significant epistemic superior on “What music does Adam work best listening to?”, and vice-versa. One can also be credulous of claims like “It turned out this diet really helped my back pain”: perhaps it’s placebo, or perhaps it is one of those cases where different things work for different people, and one expects in such cases individuals to have privileged access to what worked for them.[14]
There will be cases where one really is plowing a lonely furrow where there aren’t any close epistemic peers or superiors. It’s possible I really am the world’s leading expert on “How many counter-factual DALYs does a doctor avert during their career?”, because no one else has really looked into this question. My current role involves investigating global catastrophic biological risks, which appears understudied to the point of being pre-paradigmatic.
These comprise a very small minority of topics I have credences about. Yet even here modesty can help. One can use more distant bodies of experts: I am reassured that my autumnal estimate for the ‘DALY question’ coheres with expert consensus that medical practice had a minor role in improvements to human health, for example. Even if I don’t have any epistemic peers, I can simulate some by asking, “If there were lots of people as or more reasonable than me looking at this, would I expect them to agree with my take?” Given that the econometric-esque methods I deploy to answer the ‘DALY question’ could probably be done better by an expert, and in any case reasonable people are often sceptical of these in other areas, I am less confident of my findings than my ‘inside view’ suggests, which I take to be a welcome corrective to ‘pet argument’ biases.[15]
In theory, the world should be mad
Whether devoured by Moloch, burned by Ra, trapped by aberrant signalling equilibria, or whatever else, we can expect to predict when apparent expert classes (and apparent epistemic peers) are going to collectively go wrong. With this knowledge, we can know on which topics we should expect ourselves to outperform expertise. Rather than the scenario where we commonly find ourselves looking up (at experts) or around (at our peers), we find ourselves in many situations where those who are usually epistemic peers or superiors are below us; and above us, only sky.
We could distinguish two sorts of madness, a surprising absence of expertise and a surprising error of expertise:
The former is a gap in the epistemic market. Although an important topic should be combed over by a body of experts, for whatever reason it isn’t, and so it takes surprisingly little effort to climb to the summit of epistemic superiority. In such cases our summaries of expert classes as ranging over a broad area conceal that the degree of expertise is very patchy: public health experts generally know a great deal about the health impacts of smoking; they usually know much less about the health impacts of nicotine.
The latter is a stronger debunking argument. One appeals to some features of the world that generate expertise and suggests that these expertise-generating features are anti-correlated with the truth; thus one can adjudicate between warring expert camps (or just indict all so-called ‘experts’) based on this knowledge. One strong predictor of incompatibilism regarding free will among philosophers is believing in God. If we are confident these beliefs in God are irrational, then we can winnow the expert class by this consideration and side with the compatibilist camp much more strongly.
Yet, similar to the problems of debunking mentioned earlier, that there is a good story suggesting one of these things does not imply one will do better ‘striking out on one’s own’. Even in cases of disease where accuracy is poorly correlated to expert activity, it is hard to think of cases where these line up orthogonal or worse. Big pharma studies are infamous, but even if you’re in big pharma optimising for ‘can I get evidence to support my product’, your drug actually working does make this easier. Even in pre-replication crisis psychology, true results would be overrepresented versus false ones in the literature compared to some base rate across generated hypotheses.
The ‘residual’ expert class still often remains better. Although most public health experts know little about nicotine per se, there are some nearby health experts, perhaps scattered across our common-sense demarcation of fields, who do know about the impacts of nicotine. It may still take quite a lot of effort to reach parity or superiority to these. Even if we want to strike all theists from free will philosophers, compatibilism does not rise close to unanimity, and so cautions against extremely high confidence this is the correct view.[16] So, I aver, the world is not that mad.
Empirically, the world is mad
One can offer a more direct demonstration of world-madness, and so refute modesty: outperformance.
A common reply is to point to a particular case where those being modest would have gotten it wrong. There are lots of cases where amateurs and mavericks were ridiculed by common sense or experts-at-the-time, only to be subsequently vindicated.
Another problem is the modest view introduces a lag—it seems one often needs to wait for the new information to take root among one’s epistemic peers before changing one’s view, whilst a cogniser just relying on the object level updates on correct arguments ‘at first sight’. It is often crucially important to be fast as well as right in both empirical and moral matters: it is extremely costly if a view makes one slower to recognise (among many other past moral catastrophes) the horror of slavery.
Yet modesty need not be infallible, merely an improvement. Citing cases where it goes poorly is (hopefully less than) half the story. Modesty does worse in cases where the maverick is right, yet better where the maverick is wrong: there are more cases of the latter than the former. Modesty does worse in being sluggish in responding to moral revolutions, yet better at avoiding being swept away by waves of mistaken sentiment: again, the latter seem more common than the former.[17]
Maybe one can follow a strategy such that you can ‘pick the hits’ of when to carve out exceptions, and so have a superior track record. Yet, empirically, I don’t see it. When I look at people who are touted as particularly good at being ‘correct contrarians’, I see at best something like an ‘epistemic venture capitalist’ - their bold contrarian guesses are right more often than chance, but not right more often than not. They appear by my lights to be unable to judiciously ‘pick their battles’, staking out radical views in topics where there isn’t a good story as to why the experts would be getting this wrong (still less why they’re more likely to get it right). So although they do get big wins, the modal outcome of their contrarian take is a bust.[18]
Modesty should price the views of better-than-chance contrarians into how it weighs consensus. Confidence in a consensus view should fall if a good contrarian takes aim at it, but not so much that one now takes the contrarian view oneself. If one happens to be a particularly successful contrarian one should follow the same approach: “I get these right surprisingly often, but I’m still wrong more often than not, so it might be worth it to look into this further to see if I can strike gold, but until then I should bank on the consensus view.”
Expert groups are seldom in reflective equilibrium
Even if modesty works well in the ideal case of a clearly identified ‘expert class’, it can get a lot messier in reality:
Suppose one is in the early 1940s and asks, “Are there going to be explosives with many orders of magnitude more power than current explosives?” One can imagine if one consulted explosive experts (however we cash that out), their consensus would generally say ‘no’. If one was able to talk to the physicists working on the Manhattan Project, they would say ‘yes’. Which one should an outside view believe?[19]
Most people believe god exists (the so called ‘common consent argument for God’s existence’); if one looks at potential expert classes (e.g. philosophers, people who are more intelligent), most of them are Atheists. Yet if one looks at philosophers of religion (who spend a lot of time on arguments for or against God’s existence), most of them are Theists—but maybe there’s a gradient within them too. Which group, exactly, should be weighed most heavily?
So constructing the ideal ‘weighted consensus’ modesty recommends deferring to can become a pretty involved procedure. One must carefully divine whether a given topic lies closer to the magisterium of one or another putative expert class (e.g. maybe one should lean more to the physicists, as the question is really more ‘about physics’ than ‘about explosives’). One might have to carefully weigh up the relevant epistemic virtues of various expert classes that appear far from reflective equilibrium with one another (so perhaps one might use the likely selection effect in philosophy of religion to partly discount the apparent support this provides). One might have to delve into complicated issues of independence: although most people may believe god exists, unlike guesses of how many skittles are in the jar, they are not all forming this belief independently from one another.[20]
This exercise begins to look increasingly insider-view-esque. Trying to determine the right magisterium involves getting closer to object level considerations about ‘aboutness’ of topics; trying to tease apart issues of independence and selection amounts to looking at belief forming practices, and veers close to object level justifications for the belief in question. At some point it becomes extraordinarily challenging to try and back-trace from all these factors to the likely position of the ideal observer: the degrees of freedom these considerations invite (and the challenge in estimating them reliably) make strong modesty go worse.
One should not give up too early, though: modesty can still work pretty well even in these tricky cases. One can ask whether there’s any communication between the classes, and if so any direction of travel (e.g. did some explosive experts end up talking to the physicists, and agreeing they were right? Vice-versa?). Even if they were completely isolated, one can ask if a third group having access to both made a decision (e.g. the agreement of the U.S. and German governments with the implied view of the physicists). This is a lot more involved, but the expected ‘accuracy yield per unit time spent’ may still be greater than (for example) making a careful study of the relevant physics.
A broader modification would be ‘immodest only for the web of belief, but modest for the weights’: one uses an inside view to piece together the graph of considerations around P, but one still defers to consensus on the weights. This may avoid cases where (for example) strong modesty may mistake astronomers for the expert class on the claim that space travel is infeasible (versus primordial rocket scientists): astronomers and rocket scientists agreed about the necessary acceleration, but astronomers were inexpert on the key question as to whether that acceleration could be produced.[21]
What if one cannot even do that? Then modesty (rightly) offers a counsel of despair. If an area is so fractious there’s no agreement, with no way to see which of numerous disparate camps has better access to the truth of the matter; so suffused with bias that even those with apparent epistemic virtues (e.g. judgement, intelligence, subject-matter knowledge) cannot be seen to even tend towards the truth; what hope does one have to do better than they? In attempting to thread the needle through these hazards towards the right judgement, one will almost certainly run aground somewhere or somehow, like all one’s epistemic peers or superiors who made the attempt before. Perhaps reality obliges us to undertake these doxastic suicide missions from time to time. If modesty cannot help us, it can at least provide the solace of a pre-emptive funeral, rather than (as immodest views would) cheer us on to our almost certain demise.
Somewhat satisfying Shulman
Carl Shulman encourages me to offer my credences and rationale in cases he takes to be particularly difficult for my view, and suggests in these cases I either arrive at absurd credences or I am covertly abandoning the strong modesty approach. I offer these below for readers to decide—with the rider that if these are in fact absurd, ‘I’m an idiot’ is a competing explanation to ‘strong modesty is a bad epistemic practice’ (and that, assuredly, whatever one’s credence on the latter, one’s credence in the former should be far greater).
| Proposition (roughly) | Credence (ish) | (Modesty-based) rationale, in sketch |
| --- | --- | --- |
| Theism | 0.1[22] | Mostly discount common consent (non-independence) and PoR (selection). Major hits from more intelligent people/ better informed tend to be atheist, but struggle to extrapolate this closer to 0 given existence proofs of very epistemically virtuous religious people. |
| Libertarian free will | 0.1 | Commands a non-trivial minority across virtuous epistemic classes (philosophers, intelligent people, etc), only somewhat degraded by selection worries. |
| Jesus rose from the dead | 0.005 | Christianity in particular a very small fraction of possibility space of Theism. Support from its widespread support is mostly (but not wholly) screened off by non-independence effects. Relevant (but distant) expert classes in history etc. weigh adversely. |
| There has been a case of cold fusion | 10^-5 | Strong pan scientific consensus against, cold fusion community looks renegade and much less epistemically virtuous. Base rate of these conditional on no effect gives very adverse reference class. |
| ESP | 10^-6 | Very strong (but non-complete) trophism among elite common sense, scientists, etc; bad predictive track records for ESP researchers; distant consensuses highly adverse. Some greatly attenuated boost from survey data/small fraction of reasonable believers. |
Practical challenges to modesty
Modesty can lead to double-counting, or even groupthink. Suppose in the original example Beatrice does what I suggest and revises her credence to 0.6, but Adam doesn’t. Now Charlie forms his own view (say 0.4 as well) and does the same procedure as Beatrice, so Charlie now holds a credence of 0.6 as well. The average should be lower: (0.8+0.4+0.4)/3, not (0.8+0.6+0.4)/3, but the results are distorted by using one-and-a-half helpings of Adam’s credence. With larger cases one can imagine people wrongly deferring their way into consensus around a view they should think is implausible, and in general there is the nigh-intractable challenge of trying to infer cases of double counting from the patterns of ‘all things considered’ evidence.
One can rectify this by distinguishing ‘credence by my lights’ versus ‘credence all things considered’. So one can say “Well, by my lights the credence of P is 0.8, but my actual credence is 0.6, once I account for the views of my epistemic peers etc.” Ironically, one’s personal ‘inside view’ of the evidence is usually the most helpful credence to publicly report (as it helps others modestly aggregate), whilst one’s all-things-considered modest view is usually for private consumption.
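The double-counting arithmetic, and the effect of reporting ‘by my lights’ credences instead, can be sketched in a few lines of Python. This is only an illustration: it assumes straight averaging as the update rule (which, as noted elsewhere in this piece, is not quite right as a peer update rule), and the variable names are hypothetical.

```python
from statistics import mean

# Credences 'by my lights' (inside views), taken from the example above.
inside = {"Adam": 0.8, "Beatrice": 0.4, "Charlie": 0.4}

# Beatrice defers: she averages her inside view with Adam's and reports that.
beatrice_reported = mean([inside["Adam"], inside["Beatrice"]])

# Charlie averages what he hears: Adam's inside view, Beatrice's
# already-averaged report, and his own inside view. Adam's 0.8 is
# counted once directly and half again via Beatrice's report.
charlie_naive = mean([inside["Adam"], beatrice_reported, inside["Charlie"]])

# If everyone publicly reports inside views, Charlie aggregates each once.
charlie_clean = mean(list(inside.values()))

print(round(beatrice_reported, 3))  # 0.6
print(round(charlie_naive, 3))      # 0.6
print(round(charlie_clean, 3))      # 0.533
```

The gap between the last two figures is the distortion from the extra half-helping of Adam’s credence; it disappears once the inputs to the average are inside views rather than already-aggregated ones.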
Community benefits to immodesty
Modesty could be parasitic on a community level. If one is modest, one need never trouble oneself with any ‘object level’ considerations at all, and simply cultivate the appropriate weighting of consensuses to defer to. If everyone free-rode like that, no one would discover any new evidence or have any new ideas, and so we would collectively stagnate.[23] Progress only happens if people get their hands dirty on the object-level matters of the world, try to build models, and make some guesses; sometimes the experts have gotten it wrong, and one won’t ever find that out by deferring to them based on the fact they usually get it right.[24]
The distinction between ‘credence by my lights’ versus ‘credence all things considered’ allows the best of both worlds. One can say ‘by my lights, P’s credence is X’ yet at the same time ‘all things considered though, I take P’s credence to be Y’. One can form one’s own model of P, think the experts are wrong about P, and marshal evidence and arguments for why you are right and they are wrong; yet soberly realise that the chances are you are more likely mistaken; yet also think this effort is nonetheless valuable because even if one is most likely heading down a dead-end, the corporate efforts of people like you promise a good chance of someone finding a better path.
Scott Sumner seems to do something similar:
Despite this example, maybe it is the case that ‘having a creative brain which makes big discoveries’ is anticorrelated to ‘having a sober brain well-calibrated to its limitations compared to others’: anecdotally, eccentric views among geniuses are common. Maybe for most it isn’t psychologically tenable to spend one’s life investigating a renegade view one thinks ultimately is likely a dead-end, and in fact people who do groundbreaking research generally have to be overconfident to do the best science. If so, we should act communally to moderate this cost, but not celebrate it as a feature.
Not everyone has to be working on discovering new information. One could imagine a symbiosis between eccentric overconfident geniuses whose epistemic comparative advantage is to gambol around idea-space to find new considerations, and well-calibrated thoughtful people whose comparative advantage is in soberly weighing considerations to arrive at a well-calibrated all-things-considered view.
Conclusion: a paean, and a plea
I have argued above for a strong approach to modesty, one which implies—at least in terms of ‘all things considered view’ - one’s view of the object level merits counts for very little. Even if I am mistaken about the ideal strength of modesty, I am highly confident both the EA and rationalist communities err in the ‘insufficiently modest’ direction. I close on these remarks.
Rationalist/EA exceptionalism
Both communities endure a steady ostinato of complaints about arrogance. They’ve got a point. I despair of seeing some wannabe-iconoclast spout off about how obviously the solution to some famously recondite issue is X and the supposed experts who disagree obviously just need to better understand the ‘tenets of EA’ or the sequences. I become lachrymose when further discussion demonstrates said iconoclast has a shaky grasp of the basics, that they are recapitulating points already better-discussed in the literature, and so forth.[25]
To stress (and to pre-empt), the problem is not, “You aren’t kowtowing appropriately to social status!” The problem is considerable over-confidence married with inadequate understanding. This not only looks bad to outsiders,[26] but also is bad, as the individual (and the community itself) could get to the truth faster if they were more modest about their likely position in the distribution of knowledge about X, and then did commonsensical things to increase it.
Consider Gell-Mann amnesia (via Michael Crichton):
Gell-Mann cases invite inferring adverse judgements based on extrapolating from an instance of poor performance. When experts in multiple different subjects say the same thing (i.e. Murray and Crichton chatted to an expert on Palestine who had the same impression), this adverse inference gets all the stronger.
Some, perhaps many, pieces of work or corporate projects in our community share this property: although it might look good or groundbreaking to us, as the relatively less-informed, domain experts in the fields it touches upon tend to report it is misguided or rudimentary. Although it is possible to indict all these judgements, akin to a person who gives very adverse accounts of all of their previous romantic partners, we may start to wonder about a common factor explanation. Our collective ego is writing checks our epistemic performance (or, in candour, performance generally) cannot cash; general ignorance, rather than particular knowledge, may explain our self-regard.
To discover, not summarise
It is thought that to make the world go better new things need to be discovered, above and beyond making sound judgements on existing knowledge. Quickly making accurate determinations of the balance of reason for a given issue is greatly valuable for the latter, but not so much for the former.
Yet the two should not be confused. If one writes a short overview of a subject ‘for internal consumption’ which gives a fairly good impression of what a particular view should be, one should not be too worried if a specialist complains that you haven’t covered all the topics as adequately as one might. However, if one is aiming to write something which articulates an insight or understanding not just novel to the community, but novel to the world, one should be extremely concerned if domain experts review this work and say things along the lines of, “Well, this is sort of a potted recapitulation of work in our field, and this insight is widely discussed”.
Yet I see this happen a lot to things we tout as ‘breakthrough discoveries’. We want to avoid cases where we waste our time in unwitting recapitulation, or fail to catch elementary mistakes. Yet too often we license ourselves to pronounce these discoveries without sufficient modesty in cases where there’s already a large expert community working on similar matters. This does not preclude these discoveries, but it cautions us to carefully check first. On occasions where I take myself to have a new insight in areas outside my field (most often philosophy), I am extremely suspicious of my supposed discovery: all too often would this arise from my misunderstanding, or already be in the literature somewhere I haven’t looked. I carefully consult the literature as best as I can, and run the idea by true domain experts, to rule out these possibilities.[27]
Others seem to lack this modesty, and so predictably err. More generally, a more modest view of ‘intra-community versus outside competence’ may also avoid cases of having to reinvent the wheel (e.g. that scoring rule you spent six months deriving for a karma system is in this canonical paper), or for an effort to derail (e.g. oh drat, our evaluation provides worthless data because of reasons we could have known from googling ‘study design’).
Paradoxically pathological modesty
If the EA and rationalist communities comprised a bunch of highly overconfident and eccentric people buzzing around bumping their pet theories together, I may worry about overall judgement and how much novel work gets done, but I would at least grant this looks like fertile ground for new ideas to be developed.
Alas, not so much. What occurs instead is agreement approaching fawning obeisance to a small set of people the community anoints as ‘thought leaders’, and so centralizing on one particular eccentric and overconfident view.[28] So although we may preach immodesty on behalf of the wider community, our practice within it is much more deferential.
I hope a better understanding of modesty can get us out of this ‘worst of both worlds’ scenario. It can at least provide better ‘gurus’ to defer to. Better, modesty also helps to correct two mistaken impressions: one, the overly wide gap between our gurus and other experts; two, the overly narrow gap between ‘intelligent layperson in the community’ and ‘someone able to contribute to the state of the art’. Some topics are really hard: becoming someone with ‘something useful to say’ about these takes not days but years; there are many deep problems we must concern ourselves with; the few we select as champions, despite their virtue, cannot do them all alone; and we need all the outside help we can get.
Coda
What the EA community mainly has now is a briar-patch of dilettantes: each ranges widely, but with shallow roots, forming whorls around others where it deems it can find support. What it needs is a forest of experts: each spreading not so widely; forming a deeper foundation and gathering more resources from the common ground; standing apart yet taller, and in concert producing a verdant canopy.[29] I hope this transformation occurs, and aver modesty may help effect it.
Acknowledgements
I thank Joseph Carlsmith, Owen Cotton-Barratt, Eric Drexler, Ben Garfinkel, Roxanne Heston, Will MacAskill, Ben Pace, Stefan Schubert, Carl Shulman, and Pablo Stafforini for their helpful discussion, remarks, and criticism. Their kind help does not imply their agreement. The errors remain my own.
[Edit 30/10: Rewording and other corrections—thanks to Claire Zabel and Robert Wiblin]
[1] Much of this follows discussion in the social epistemology literature about conciliationism, or the ‘equal weight view’. See here for a summary
[2] They also argue at length about the appropriate weight each of these considerations should have on the scales of judgement. I suggest (although this is not necessary for this argument) that in many cases most of the action lies in judging the ‘power’ of evidence. In most cases I observe people agree that a given consideration C influences the credence one holds in P; they usually also agree in its qualitative direction; the challenge comes in trying to weigh each consideration against the others, to see which considerations one’s credence over P should pay the greatest attention to.
This may represent a general feature of webs of belief being dense and many-many (A given credence is influenced by many other considerations, and forms a consideration for many credences in turn), or it may simply be a particular feature of webs of belief in which humans perform poorly: although I am confident I can determine the sign of a particular consideration, I generally don’t back myself to hold credences (or likelihood ratios) to much greater precision than the first significant digit, and I (and, perhaps, others) struggle in cases where large numbers of considerations point in both directions.
[3] In the literature this is called ‘straight averaging’. For a variety of technical reasons this doesn’t quite work as a peer update rule. That said, given things like Bayesian aggregation remain somewhat open problems, I hope readers will accept my promissory note that there will be a more precise account which will produce effectively the same results (maybe ‘approximately splitting the difference’) through the same motivation.
[4] C.f. Aumann’s agreement theorem. As an aside (which I owe to Carl Shulman), straight averaging will not work in some degenerate cases where (similar to ‘common knowledge puzzles’) one can infer precise observations from the probabilities stated. The neatest example I can find comes from Hal Finney (see also):
[5] To motivate: Adam and Beatrice no longer know whether or not reasons they hold for or against P are private evidence or not. Yet (given epistemic peerhood), they have no principled reason to suppose “I know something that they don’t” is more plausible than the opposite. So again they should be symmetrical.
[6] (On which more later) it is worth making clear that the possibility of bias for either Adam or Beatrice doesn’t change the winning strategy on expectation. Say Adam’s credence for P is in fact biased upwards by 0.4. If Adam knows this, he can adjust and become unbiased; if Oliver or Beatrice knows this (and knows Adam doesn’t), they break the peerhood for Adam but can simulate unbiased Adam* which would remain a peer, and act accordingly. If none of them know this, then it is the case that Beatrice wins, as does Oliver following a non-averaging ‘go with Beatrice’ strategy. Yet this is simply epistemic luck: without information, all reasonable prior distribution candidates of (Adam’s bias − Beatrice’s bias) are symmetrical about 0.
[7] Another benefit of modesty is speed: Although it is the case Adam and Beatrice’s credence (and thus the average) gets more accurate if they have time to discuss it, and so catch one another if they make a mistake or reveal previously-private evidence, averaging is faster and the trade-off in time for better precision may not be worth it. It still remains the case, as per the first example, that they still do better, after this discussion, if they meet in the middle on residual disagreement.
[8] A further (albeit minor and technical) dividend is that although individual guesses may form any distribution (for which the standard deviation may not be a helpful summary), the central limit theorem applies to the average of guesses distribution, so it tends to normality.
[9] Even if one is the world authority, there should be some deference to lesser experts. In cases where the world expert is an outlier, one needs to weigh up numbers versus (relative) epistemic superiority to find the appropriate middle.
[10] God from the Mount of Sinai, whose gray top
Shall tremble, he descending, will himself
In Thunder Lightning and loud Trumpets sound
Ordaine them Lawes…
Milton, Paradise Lost
[11] I take the general pattern that strong modesty usually immunises one against common biases to be a further point in its favour.
[12] I owe this to Eric Drexler
[13] A related philosophical defence would point out that the self-undermining objection would only apply to whether one should believe modesty, not whether modesty is in fact true.
[14] I naturally get much more sceptical if that person then generalises from this N=1 uncontrolled unblinded crossover trial to others, or takes it as lending significant support against some particular expert consensus or expertise more broadly: “Doctors don’t know anything about back pain! They did all this rubbish but I found out all anyone needs to do is cut carbs!”
[15] It also provokes fear and trembling in my pre-paradigmatic day job, given I don’t want the area to have strong founder effects which poorly track the truth.
[16] For example:
[17] Aside: A related consideration is ‘optimal damping’ of credences, which is closely related to resilience. Very volatile credences may represent the buffeting of a degree of belief by evidence large relative to one’s prior, but it may also represent poor calibration in overweighing new evidence (and vice versa). The ‘ideal’ response in terms of accuracy is given by standard theory. Yet it is also worth noting that one’s prudential reasons may favour introducing further lag or lead, akin to the ‘D’ or ‘I’ components of a PID controller. In large irreversible decisions (e.g. career choice) it may be better to wait a while after one’s credences support a change before changing action; in cases of new moral considerations it may be better to act ‘in advance’ for precautionary principle-esque reasons.
[18] (Owed to Will MacAskill) There’s also a selection effect: of a sample of ‘accurate contrarians’, many of these may be lucky rather than good.
[19] I owe this particular example to Eric Drexler, but similar counter-examples along these lines to Carl Shulman.
[20] Another general worry is these difficult-to-divine considerations offer plenty of fudge factors—both to make modesty get the ‘right answer’ in historical cases, and to fudge present areas of uncertainty to get results that accord with one’s prior judgement.
[21] I owe both this modification and example to discussions with Eric Drexler. There are some costs—one may think there are cases one should defer to an outside view on the web of belief (E.g. Christian apologist: “Sure, I agree with scientific consensus that it’s improbable Jesus rose naturally from the dead, but the key argument is whether Jesus rose supernaturally from the dead. So the consensus for philosophers of religion is the right expert class.”) The balance of merit overall is hard to say, but such a modification still looks like pretty strong modesty.
[22] In conversation I recall a suggestion by Shulman such a credence should change one’s behaviour regarding EA—maybe one should do theology research in the hope of finding a way to extract infinite value etc. Yet the expert class for action|Theism gives a highly adverse prior: virtually no actual theists (regardless of theological expertise, within or outside EA) advocate this.
[23] I understand a similar point is raised in economics regarding the EMH and the success of index funds. Someone has to do the price discovery.
[24] I owe this mainly to Ben Pace; Andrew Critch argues similarly.
[25] For obvious reasons I’m reluctant to cite specific examples. I can offer some key words for the sort of topics I see this problem as endemic: Many-worlds, population ethics, free will, p-zombies, macroeconomics, meta-ethics.
[26] C.f. Augustine, On the Literal Meaning of Genesis:
[27] I’m uncommonly fortunate that for me such domain experts are both nearby and generous with their attention. Yet this obstacle is not insurmountable. An idea (which I owe to Pablo Stafforini) is that a contrarian and a sceptic of the contrarian view could bet on whether a given expert, on exposure to the contrarian view, would change their mind as the contrarian predicts. S may bet with C: “We’ll pay some expert $X to read your work explicating your view; if they change their mind significantly in favour (however we cash this out) I’ll pay the $X, if not, you pay the $X.”
[28] C.f. Askell’s and Page’s remarks on ‘buzz’.
[29] Perhaps unsurprisingly, I would use a more modest ecological metaphor in my own case. In reclaiming extremely inhospitable environments, the initial pioneer organisms die rapidly. Yet their corpses sustain detritivores, and little by little, an initial ecosystem emerges to be succeeded by others. In a similar way, I hope that the detritus I provide will, after a fashion (and a while), become the compost in which an oak tree grows.