Thank you! This is a genuinely good question. (Note: I answered via voice and then edited the transcript below with Chat—can circle back if style is an issue, but this covers every point I discussed—if doing this is a problem for some reason I’m happy to write anew! The content is correct):
Your question surfaces the key misunderstanding. The claim isn’t that we should fear drugs that help some people and hurt others. It’s that our measurement architecture is set up in a way that systematically misclassifies who is helped, who is harmed, and by how much, because the scales themselves flatten the underlying geometry of experience. Once you compress long-tailed intensities into a 1–10 box and then average them, you lose the structure that actually matters for real-world well-being.
In a world where symptoms behave linearly and add up nicely, a drug that helps half and hurts half is perfectly intelligible: you imagine two overlapping Gaussians, shrug, and say “worth a try.” But that isn’t the world we actually inhabit. If rumination goes from 6 to 4, the subjective win might be modest because you’re moving along a shallow part of the curve. If akathisia goes from 6 to 8, the subjective loss might be massive because you’ve crossed into a steep tail where each step carries exponential experiential weight. On the form, these are both “two-point changes.” In lived reality, they belong to different moral universes. This asymmetry in the tails means that “50% better, 50% worse” is not a neutral mixture; the average hides the fact that the extremes on one side can dominate the arithmetic.
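To make that arithmetic concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that felt intensity grows exponentially with the 1–10 rating; the doubling per step is a made-up stand-in, not an empirical claim:

```python
# Minimal sketch, not a validated model: suppose felt intensity roughly
# doubles with each step on a 1-10 rating. The base is an illustrative guess.
BASE = 2.0

def felt_intensity(score: float) -> float:
    """Map a 1-10 rating onto a long-tailed 'experiential' magnitude."""
    return BASE ** score

# Two changes that both read as "two points" on the form:
rumination_relief = felt_intensity(6) - felt_intensity(4)  # 6 -> 4 (improvement)
akathisia_harm    = felt_intensity(8) - felt_intensity(6)  # 6 -> 8 (worsening)

print(f"rumination 6->4 relieves {rumination_relief:.0f} intensity units")
print(f"akathisia  6->8 adds     {akathisia_harm:.0f} intensity units")
# On the sheet, -2 and +2 average to zero. Under this mapping the harm
# (192 units) is four times the relief (48 units), so the "neutral"
# average conceals a large net loss.
```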
I don’t think this problem is abstract, merely theoretical, or too complex to do anything about. It has immediate practical consequences. Trials and regulators work with compressed reports, so the deepest harms appear as mild perturbations in the dataset. Drugs whose side-effect profiles involve steep-tail states like akathisia or mixed autonomic rebound look safer than they really are for a meaningful minority of users. Clinicians then inherit an evidence base where the worst experiential states have been squashed into “mild adverse events,” and that shapes expectations, heuristics, and prescribing norms. The problem is not clinician negligence — it’s that the underlying data they rely on has already thrown away the signal.
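Here is a toy simulation of that compression effect, not a model of any real trial, with arbitrary distribution parameters: heavy-tailed "true" severities get recorded on a bounded 1–10 item, and the worst cases end up only a few points above the typical report:

```python
# Toy illustration, not a claim about any specific trial: heavy-tailed "true"
# adverse-experience severities get recorded on a bounded 1-10 rating item.
import numpy as np

rng = np.random.default_rng(0)
true_severity = rng.lognormal(mean=1.0, sigma=1.0, size=10_000)  # heavy-tailed
recorded = np.clip(np.round(true_severity), 1, 10)               # the rating item

worst = true_severity > np.percentile(true_severity, 99)
print(f"worst 1%, true mean severity : {true_severity[worst].mean():.1f}")
print(f"worst 1%, recorded mean      : {recorded[worst].mean():.1f}")
print(f"everyone, recorded mean      : {recorded.mean():.1f}")
# The tail is an order of magnitude above the typical experience, but once
# clipped into the box it sits a few points above the average report,
# i.e. it reads as a mild perturbation in the dataset.
```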
If we took the geometry seriously, we’d end up with a very different picture. High-variance drugs can be extraordinarily useful when we know how to identify responders and anti-responders. What we’re missing is the mapping. With better instruments, you’d get early detection of bad trajectories, N-of-1 response curves, and a more honest sense of which symptom profiles are compatible with which medications. The same drug could be life-changing for one subgroup and acutely harmful for another, and we could actually see that, instead of blending the two together into a 0.3σ effect size. This is less “anti-medication” and more “finally doing the epistemology correctly.”
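And one last sketch of the blending problem: two subgroups with opposite responses pooled into a single headline number. The subgroup sizes and effect magnitudes below are invented to make the arithmetic visible, nothing more:

```python
# Invented example: a bimodal response blended into one headline number.
import numpy as np

rng = np.random.default_rng(1)
responders      = rng.normal(loc=+1.2, scale=1.0, size=500)  # clear benefit (SD units)
anti_responders = rng.normal(loc=-0.6, scale=1.0, size=500)  # clear harm
pooled = np.concatenate([responders, anti_responders])

print(f"pooled mean change  : {pooled.mean():+.2f} sigma")   # roughly +0.3
print(f"responder mean      : {responders.mean():+.2f} sigma")
print(f"anti-responder mean : {anti_responders.mean():+.2f} sigma")
# The pooled figure looks like a mildly useful drug for everyone. The subgroup
# structure (who it helps, who it hurts, and how badly) only appears if the
# instrument and the analysis are built to keep it.
```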
Good clinical practice already tries to rely on patient narratives, but even that is downstream of the larger culture of interpretation we’ve built on top of flattened scales. When the scientific literature underweights the steepest affective states, everyone downstream learns to underweight them too. The patient who says “this made my inner restlessness unbearable” is implicitly competing with a literature that reports only “mild activation” for the same phenomenon. There is no shortage of documented cases of people harmed by exactly this dynamic.
The upshot is simple: this isn’t an argument for pessimism about psychiatric meds. The core point is about epistemic clarity. The experiential landscape is long-tailed, clustered, and nonlinear; our measurement system is linear, additive, and tidy. When you force one onto the other, you get averages that obscure the very variation we need to guide good decisions. A better measurement pipeline wouldn’t make us more cautious or more reckless; it would make us more accurate. And accuracy is the only way to use high-variance interventions wisely — whether you’re trying to help one patient or setting policy for millions.
If the world ran fully on the arithmetic of symptom sheets, none of this would matter. But the world runs on compounding long-tail distributions of suffering and relief, and that geometry is strange, heavy-tailed, and morally lopsided. Our tools need to catch up.