Yeah, I think this is a totally fair critique and I updated some after reading it!
I wrote the above after a long Slack conversation with Aaron at like 2AM, just trying to capture the rough shape of the argument without spending too much time on it.
I do think actually chasing this argument all the way through is interesting and possibly worth it. I think it’s pretty plausible it could make a 2-3x difference in the final outcome (and possibly a lot more!), and I hadn’t actually thought through it all the way. And while I had some gut sense it was important to differentiate between median and tail outcomes here, I hadn’t properly thought through the exact relationship between the two and am appreciative of you doing some more of the thinking.
I currently prefer your estimate of “moving it from 20% to 38%” as something like my best guess.
So, one thing I was thinking about was that people frequently use the murder-rate as a proxy for the overall crime rate, and I think I remember people doing that without any adjustment of the type you are thinking about here. Is there something special about the murder rate as a fraction of violent crimes, or should we actually make the same adjustments in that case?
I think similar adjustments should be made if you are extrapolating to crimes with very different prevalence. For example, the US murder rate is 4-5x that of the UK, but I wouldn’t expect the US to have that many more bike thefts.
Proxy seems fine if you’re focused on which country/city/etc. has higher overall crime, rather than estimating magnitude.
(FWIW, an attempt at Googling the above suggests ~300k bike thefts per year in the UK versus 2m in the US. The US population is 5x bigger, so that's only 1.33x the UK rate. A quick check on bicycle sales in the two countries does not suggest that this is because of very different cycling rates. No links because I'm on my phone, but the above is very rough anyway. I'm left with somewhat greater confidence that the gap is in fact <<4x, like 1.2x-2x, though.)
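As a quick sanity check on that arithmetic, here is a minimal sketch using only the rough figures quoted above (these are the comment's ballpark numbers, not verified data, and the variable names are just for illustration):

```python
# Rough figures quoted in the comment above (ballpark, not verified data).
uk_thefts = 300_000
us_thefts = 2_000_000
population_ratio = 5  # US population taken as ~5x the UK's, as in the comment

# Per-capita rate ratio = (ratio of raw counts) / (ratio of populations).
per_capita_ratio = (us_thefts / uk_thefts) / population_ratio

print(f"US bike-theft rate is roughly {per_capita_ratio:.2f}x the UK rate")
```

The point is only that the per-capita gap for a common crime comes out far below the 4-5x gap in murder rates.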
Similar comments could be made about extrapolating from the large number of US billionaires (way more per capita than any other country IIRC) to the relative rates of people earning more than $200k/$50k/etc. That case might be more intuitive.
A less important motivation/mechanism is that probabilities/ratios (unlike odds) are bounded above by one. For rare events, 'doubling the probability' and 'doubling the odds' give basically the same answer, but not so for more common events. Loosely, flipping a coin three times 'trebles' my risk of observing it landing tails, but the probability isn't 1.5. (cf).
E.g.
Sibling abuse rates are something like 20% (or 80% depending on your definition). And is the most frequent form of household abuse. This means by adopting a child you are adding something like an additional 60% chance of your other child going through at least some level of abuse (and I would estimate something like a 15% chance of serious abuse). [my emphasis]
If you used the 80% definition instead of 20%, then the ‘4x’ risk factor implied by 60% additional chance (with 20% base rate) would give instead an additional 240% chance.
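To make the probability-versus-odds point concrete, here is a minimal sketch. The helper names (`prob_to_odds`, `scale_probability`, `scale_odds`) are made up for illustration, and applying the same '4x' factor to the odds is an assumption for comparison's sake, not something claimed in the quoted comment:

```python
def prob_to_odds(p):
    """Convert a probability in [0, 1) to odds."""
    return p / (1 - p)


def odds_to_prob(o):
    """Convert odds back to a probability."""
    return o / (1 + o)


def scale_probability(p, factor):
    """Naively multiply a probability by a risk factor (can exceed 1)."""
    return p * factor


def scale_odds(p, factor):
    """Multiply the odds by the factor, then convert back (always below 1)."""
    return odds_to_prob(prob_to_odds(p) * factor)


# Same '4x' factor applied at a rare base rate and at the 20% / 80% definitions.
for base in (0.001, 0.2, 0.8):
    naive = scale_probability(base, 4)
    via_odds = scale_odds(base, 4)
    print(f"base={base:.3f}  4x probability={naive:.3f}  4x odds={via_odds:.3f}")

# The coin example: three flips do not give a 'probability' of 1.5.
p_tails_in_three = 1 - 0.5 ** 3
print(f"P(at least one tails in three flips) = {p_tails_in_three}")
```

For the rare base rate the two approaches nearly agree. At the 80% base rate the naive multiplication gives a 'probability' above 1 (the 'additional 240%' case), while scaling the odds stays bounded.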
[(Of interest, 20% to 38% absolute likelihood would correspond to an odds ratio of ~2.5, in the ballpark of the 3-4x risk factors discussed before. So maybe extrapolating extreme event ratios to less-extreme event ratios can do okay if you keep them in odds form. The underlying story might have something to do with logistic distributions closely resembling normal distributions (save at the tails): shifting a normal distribution along the x axis, so that (non-linearly) more or less of it lies over a threshold, loosely resembles adding increments to log-odds (equivalent to multiplying odds by a constant), which gives (non-linear) changes in probability when traversing a logistic CDF.
But it still breaks down when extrapolating very large ORs from very rare events. Perhaps the underlying story here has something to do with higher kurtosis: '>2SD events' are only (I think) ~5X more likely than >3SD events for logistic distributions, versus ~20X in normal distribution land. So large shifts in the likelihood of rare(r) events would imply large shifts in logistic-land (which dramatically change the whole distribution, e.g. an OR of 10 makes evens --> >90%) but much more modest ones in normal-land (e.g. moving up an SD gives OR>10 for previously 3SD events, but ~2 for previously 'above average' ones).)]
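A rough way to check some of the figures in the bracketed aside, using only the standard library. One assumption to flag loudly: the logistic distribution here is rescaled to unit standard deviation (scale = sqrt(3)/pi), which the comment does not specify; the function names are illustrative.

```python
import math


def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def normal_sf(x):
    """P(X > x) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))


# ASSUMPTION: logistic rescaled to have standard deviation 1.
LOGISTIC_SCALE = math.sqrt(3) / math.pi


def logistic_sf(x):
    """P(X > x) for a zero-mean logistic with unit standard deviation."""
    return 1 / (1 + math.exp(x / LOGISTIC_SCALE))


# Odds ratio implied by moving from 20% to 38%.
or_20_to_38 = odds(0.38) / odds(0.20)

# How much more likely '>2SD' events are than '>3SD' events.
normal_tail_ratio = normal_sf(2) / normal_sf(3)
logistic_tail_ratio = logistic_sf(2) / logistic_sf(3)

# An odds ratio of 10 applied to evens (odds of 1).
evens_after_or10 = 10 / (1 + 10)

# Odds ratio for a previously-3SD event after shifting a normal up by one SD.
or_3sd_after_shift = odds(normal_sf(2)) / odds(normal_sf(3))

print(f"OR for 20% -> 38%:            {or_20_to_38:.2f}")
print(f"normal   P(>2SD) / P(>3SD):   {normal_tail_ratio:.1f}")
print(f"logistic P(>2SD) / P(>3SD):   {logistic_tail_ratio:.1f}")
print(f"evens after OR of 10:         {evens_after_or10:.3f}")
print(f"OR at 3SD after a +1SD shift: {or_3sd_after_shift:.1f}")
```

Under this scaling the tail ratios come out around 6 for the logistic and around 17 for the normal, so the '~5X versus ~20X' contrast holds in spirit, though the exact numbers depend on how the logistic is standardised.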
Yep, I should have definitely kept the probabilities in log-form, just to be less confusing. It wouldn’t have made a huge difference to the outcome, but it seems better practice than the thing that I did.