This is a great post!

> ITN estimates sometimes consider broad versions of the problem when estimating importance and narrow versions when estimating total investment for the neglectedness factor (or otherwise exaggerate neglectedness), which inflates the overall results
I really like this framing. It isn’t an ITN estimate, but a related claim I think I’ve seen a few times in EA spaces is:
“billions/trillions of dollars are being invested in AI development, but very few people are working on AI safety”
I think this claim:
- Seems to ignore large swathes of work geared towards safety-adjacent things like robustness and reliability.
- Discounts other types of AI safety “investments” (e.g., public support, regulatory efforts).
- Smuggles in a version of “AI safety” that actually means something like “technical research focused on catastrophic risks motivated by a fairly specific worldview”.
I still think technical AI safety research is probably neglected, and I expect there’s an argument here that does hold up. I’d love to see a more thorough ITN on this.
Yeah, this sort of thing is partly why I tend to feel better about BOTECs (back-of-the-envelope calculations) like the following (writing very quickly, tbc!; there’s a rough numerical sketch after the walkthrough):
What could we actually accomplish if we (e.g.) doubled (the total stock/flow of) investment in ~technical AIS work (specifically the stuff focused on catastrophic risks, in this general worldview)? (You could broaden this if you wanted to, obviously.)
Well, let’s see:
That might look like:
adding maybe ~400(??) FTEs similar (in ~aggregate) to the folks working here now, distributed roughly in proportion to current efforts / profiles — plus the funding/AIS-specific infrastructure (e.g. institutional homes) needed to accommodate them
E.g. across intent alignment stuff, interpretability, evals, AI control, ~safeguarded AI, AI-for-AIS, etc., across non-profit/private/govt (but in fact aimed at loss of control stuff).
How good would this be?
Maybe (per year of doubling) we’d then get something like similar-ish value from this as we do from a year of the current space (or something like 2x less, if we want to eyeball diminishing returns)
Then maybe we can look at what this space has accomplished in the past year and see how much we’d pay for that / how valuable that seems...
(What other ~costs might we be missing here?)
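To make the shape of that BOTEC concrete, here’s a minimal sketch in Python. Every number in it (the current field size, the diminishing-returns factor, the cost per FTE-year) is a placeholder assumption for illustration, not an estimate anyone in this thread has endorsed:

```python
# Rough sketch of the "what would doubling buy us?" BOTEC above.
# Every number here is a placeholder assumption, not a claim from the thread.

current_ftes = 400            # assumed current size of the ~technical AIS field (FTEs)
added_ftes = 400              # the hypothetical doubling: similar people, similar distribution
value_of_current_year = 1.0   # normalise: value of one year of the current field's output
diminishing_returns = 2.0     # eyeballed factor: the added 400 FTEs produce ~2x less value

value_per_year_of_doubling = value_of_current_year * (added_ftes / current_ftes) / diminishing_returns

# You'd then compare this against the cost of the funding + infrastructure needed
# to accommodate ~400 extra FTEs, e.g. an assumed fully-loaded cost per FTE-year.
cost_per_fte_year = 500_000   # placeholder, USD
total_cost_per_year = added_ftes * cost_per_fte_year

print(f"Value per year of doubling (in 'current-field-years'): {value_per_year_of_doubling:.2f}")
print(f"Cost per year of doubling: ${total_cost_per_year:,.0f}")
```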
You might also decide that you have much better intuitions for how much we’d accomplish (and how valuable that’d be) on a different scale (e.g. adding one project like Redwood/Goodfire/Safeguarded AI/..., i.e. more like 30 FTEs than 400 — although you’d probably want to account for considerations like “for each ‘successful’ project we’d likely need to invest in a bunch of attempts / surrounding infrastructure...”), or intuitions about what amount of investment is required to get to some particular desired outcome...
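If the single-project scale feels more natural, the same sketch might look like this, with the attempts-per-success and overhead figures invented purely to show where those considerations would enter:

```python
# Same BOTEC at the "one project" scale, with the caveat above: each 'successful'
# Redwood/Goodfire-sized project probably requires several attempts plus
# surrounding infrastructure. All numbers are placeholder assumptions.

ftes_per_project = 30          # a Redwood/Goodfire-sized team
attempts_per_success = 3       # assumed number of attempts/supporting bets per success
overhead_multiplier = 1.3      # assumed extra infrastructure/funding overhead

effective_ftes_per_success = ftes_per_project * attempts_per_success * overhead_multiplier

value_per_successful_project = 0.05   # assumed, in 'current-field-years' of value

value_per_effective_fte = value_per_successful_project / effective_ftes_per_success
print(f"Effective FTEs per successful project: {effective_ftes_per_success:.0f}")
print(f"Value per effective FTE-year: {value_per_effective_fte:.4f} current-field-years")
```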
Or if you took the more ITN-style approach, you could try to build the BOTEC via something like:
(1) how much investment there has been so far in this broad ~POV / portfolio, plus
(2, option a) how much value/progress this portfolio has made, together with something like “how much of that came in the second half?” (to get a sense of how much we’re facing diminishing returns at the moment — fwiw, without thinking too much about it, I think “not super diminishing returns at the mo”), or
(2, option b) what fraction of the overall “AI safety problem” is “this-sort-of-safety-work-affectable” (i.e. something like “if we scaled up this kind of work — and only this kind of work — to an insane degree, how much of the problem would be fixed?”), plus how big/important the problem is overall… Etc.
(Again, for all of this my main question is often “what are the sources of signal or taste / heuristics / etc. that you’re happier basing your estimates on?”)
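And for the more ITN-style route, a comparable sketch might combine the pieces roughly like this; again, the investment, progress, and “fraction affectable” inputs are all made-up placeholders, just to show how (1), (2a), and (2b) would plug together:

```python
# Sketch of the more ITN-style framing of the same BOTEC.
# Every input is a placeholder assumption.

# (1) Investment so far in this broad portfolio (e.g. cumulative FTE-years).
investment_so_far = 3_000          # assumed cumulative FTE-years

# Option (a): progress so far, plus how much of it came in the "second half"
# of the investment (a crude check on diminishing returns).
progress_so_far = 0.10             # assumed fraction of the problem addressed so far
share_from_second_half = 0.45      # ~0.5 would suggest returns aren't diminishing much yet

marginal_progress_per_fte_year = (progress_so_far * share_from_second_half) / (investment_so_far / 2)

# Option (b): what fraction of the overall problem this kind of work could fix
# if scaled up enormously, times how big/important the overall problem is.
fraction_affectable = 0.3          # assumed ceiling for this-sort-of-safety-work
problem_importance = 1.0           # in whatever units you're valuing the whole problem

value_ceiling = fraction_affectable * problem_importance

print(f"Option (a) marginal progress per FTE-year: {marginal_progress_per_fte_year:.5f}")
print(f"Option (b) value ceiling for this portfolio: {value_ceiling:.2f}")
```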