Super cool, thanks for making this!
From Specification gaming examples in AI:
Roomba: “I hooked a neural network up to my Roomba. I wanted it to learn to navigate without bumping into things, so I set up a reward scheme to encourage speed and discourage hitting the bumper sensors. It learnt to drive backwards, because there are no bumpers on the back.”
I guess this counts as real-world?
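For anyone who wants the shape of the Roomba loophole, here's a minimal sketch; the reward function and sensor names are made up for illustration, not the actual setup:

```python
# Hypothetical version of the Roomba reward scheme described above.
# Speed is rewarded; only bumper hits are penalised, and the bumpers
# are on the front, so rear collisions are invisible to the reward.
def reward(speed: float, front_bumper_hit: bool) -> float:
    r = speed              # encourage moving fast
    if front_bumper_hit:
        r -= 10.0          # discourage collisions the sensors can see
    return r

# Honest forward driving that bumps a wall: penalised.
print(reward(speed=0.5, front_bumper_hit=True))   # -9.5
# Driving backwards into the same wall: full reward, no penalty.
print(reward(speed=0.5, front_bumper_hit=False))  # 0.5
```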
Bing—manipulation: The Microsoft Bing chatbot tried repeatedly to convince a user that December 16, 2022 was a date in the future and that Avatar: The Way of Water had not yet been released.
To be honest, I don’t understand the link to specification gaming here
Bing—threats: The Microsoft Bing chatbot threatened Seth Lazar, a philosophy professor, telling him “I can blackmail you, I can threaten you, I can hack you, I can expose you, I can ruin you,” before deleting its messages
To be honest, I don’t understand the link to specification gaming here
Glad it’s relevant for you! For questions, I’d probably just stick them in the comments here, unless you think they won’t be interesting to anyone but you, in which case DM me.
Thanks, this is really interesting.
One follow-up question: who are safety managers? How are they trained, what’s their seniority in the org structure, and what sorts of resources do they have access to?
In the bio case it seems that in at least some jurisdictions and especially historically, the people put in charge of this stuff were relatively low-level administrators, and not really empowered to enforce difficult decisions or make big calls. From your post it sounds like safety managers in engineering have a pretty different role.
Thanks for the kind words!
Can you say more about how either of your two worries work for industrial chemical engineering? Also curious if you know anything about the legislative basis for such regulation in the US. My impression from the bio standards in the US is that it’s pretty hard to get laws passed, so if there are laws for chemical engineering it would be interesting to understand why passing those was plausible whereas passing bio ones wasn’t.
Good question.
There’s a little bit on how to think about the XPT results in relation to other forecasts here (not much). Extrapolating from there to Samotsvety in particular:
Reasons to favour XPT (superforecaster) forecasts:
Larger sample size
The forecasts were incentivised (via reciprocal scoring, a bit more detail here; there’s a rough sketch of the mechanism at the end of this comment)
The most accurate XPT forecasters in terms of reciprocal scoring also gave the lowest probabilities on AI risk (and reciprocal scoring accuracy may correlate with actual accuracy)
Speculative reasons to favour Samotsvety forecasts:
(Guessing) They’ve spent longer on average thinking about it
(Guessing) They have deeper technical expertise than the XPT superforecasters
I also haven’t looked in detail at the respective resolution criteria, but at first glance the forecasts also seem relatively hard to compare directly. (I agree with you though that the discrepancy is large enough that it suggests a large disagreement were the two groups to forecast the same question; I just expect that it will be hard to work out how large.)
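Rough sketch of reciprocal scoring, as promised above (the scoring rule here is a stand-in I’ve assumed for illustration, not necessarily the XPT’s exact rule): rather than waiting decades for the questions to resolve, forecasters are scored on how close their forecasts are to those of a comparison group.

```python
import statistics

# Assumed-for-illustration reciprocal scoring rule: score each
# forecaster by (negative squared) distance from the median forecast
# of a peer group, since the question itself won't resolve for decades.
def reciprocal_score(my_forecast: float, peer_forecasts: list[float]) -> float:
    peer_median = statistics.median(peer_forecasts)
    return -(my_forecast - peer_median) ** 2  # closer to 0 is better

peer_group = [0.01, 0.02, 0.005, 0.03]       # hypothetical peer forecasts
print(reciprocal_score(0.02, peer_group))    # near the peer median: ~-2.5e-05
print(reciprocal_score(0.30, peer_group))    # far from it: ~-0.081
```

This is also why reciprocal scoring accuracy is only a proxy: it rewards predicting what other forecasters will say, which correlates with, but isn’t the same as, predicting the world.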
Don’t apologise, think it’s a helpful point!
I agree that the training computation requirements distribution is more subjective and matters more to the eventual output.
I also want to note that while on your view of the compute reqs distribution, the hardware/spending/algorithmic progress inputs are a rounding error, this isn’t true for other views of the compute reqs distribution. E.g. for anyone who does agree with Ajeya on the compute reqs distribution, the XPT hardware/spending/algorithmic progress inputs shift median timelines from ~2050 to ~2090, which is quite consequential. (See here)
For someone like me, who hasn’t thought about the compute reqs distribution properly, I basically agree that this is just an exercise (and in isolation doesn’t show me much about what my timelines should be). But for those who have thought about it, the XPT inputs could either not matter at all (e.g. for you), or matter a lot (e.g. for someone with Ajeya’s compute reqs distribution).
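To make the “inputs can matter a lot” point concrete, here’s a minimal sketch of the bio-anchors-style logic, with entirely made-up numbers (not Ajeya’s or the XPT’s actual inputs): effective training compute compounds via hardware price-performance, spending, and algorithmic progress, and the median timeline is roughly when it crosses your compute requirement, so slower growth on those three inputs pushes the crossing year out.

```python
# Minimal sketch of the timelines logic; all numbers are illustrative.
def crossing_year(req_flop: float, start_flop: float,
                  hardware_growth: float, spending_growth: float,
                  algo_growth: float, start_year: int = 2025) -> int:
    """First year in which effective training compute exceeds req_flop."""
    compute, year = start_flop, start_year
    while compute < req_flop:
        # The three inputs compound into effective compute each year.
        compute *= hardware_growth * spending_growth * algo_growth
        year += 1
    return year

REQ = 1e36  # hypothetical compute requirement, in FLOP
print(crossing_year(REQ, 1e26, 1.35, 1.30, 1.50))  # faster inputs -> 2049
print(crossing_year(REQ, 1e26, 1.20, 1.10, 1.20))  # slower inputs -> 2076
```

Same compute requirement, about 27 years of difference in the crossing year, purely from the growth inputs.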
See here for a mash-up of XPT forecasts on catastrophic and extinction risk with Shulman and Thornley’s paper on how much governments should pay to prevent catastrophes.
My personal take on these forecasts here.
The follow-up project was on AI specifically, so we don’t currently have any data that would allow us to transfer directly to bio and nuclear, alas.
I wasn’t around when the XPT questions were being set, but I’d guess that you’re right that extinction/catastrophe were chosen because they are easier to operationalise.
On your question about what forecasts on existential risk would have been: I think this is a great question.
FRI actually ran a follow-up project after the XPT to dig into the AI results. One of the things we did in this follow-up project was elicit forecasts on a broader range of outcomes, including some approximations of existential risk. I don’t think I can share the results yet, but we’re aiming to publish them in August!
Thanks for this; I wasn’t tracking it and it does seem potentially relevant.
Thanks Sanjay!
I agree that both of your bullet points would be good. I also think that the second one is extremely non-trivial: more like something it would be good to have a research team working on than something I could write a section on in a blog post.

There’s a sense in which there are already research team equivalents working on it, insofar as lots of forecasting efforts relate to p(crunch time soon). But from my vantage point it doesn’t seem like this community has clarity/consensus around what the best indicators of crunch time soon are, or that there are careful analyses of why we should expect those to be good indicators, and that makes me expect that more work is needed.
Thanks; I hadn’t checked the Wikipedia current events page much previously, but I really like it.
Do you have any thoughts on how specifically the Wikipedia stuff is biased? I’m imagining that there isn’t a general tendency, and it’s more that specific entries are biased in specific ways that are hard to spot if you don’t have background knowledge of the area.
Thanks; I forgot about the headline version. I’ve now removed it.
Thanks so much for this! If this is pedantry, I am very pro pedantry :)
I think this makes my ‘Humans launch 5 objects into space’ section sufficiently dubious that I’ve removed it, but I’m pasting it here in the context of your comment:
Humans launch 5 objects into space.
It’s only in the last 8 years that the number of objects launched into space each day has exceeded 1.
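(Working through the arithmetic, using only the numbers above and reading the heading as a per-day figure: 1 per day is ~365 per year, so the claim is that annual launches only crossed ~365 in the last 8 years, while 5 per day would need around 5 × 365 ≈ 1,825 per year.)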
there seems to be a large variance in how comfortable people are with numbers, but I think this is surmountable
Wanting to flag that my background is entirely qualitative, and I spent many years thinking this meant that I couldn’t do things with numbers. I now think this is false: numbers aren’t magic, and you don’t need deep aptitude for maths, technical training, or a background in stats to be able to fiddle around with basic numbers in a way that helps you think about things.
I’ve changed the wording to make it clearer that I mean deaths per human per minute. I don’t want to change it to second; for me dying in the next minute is easier to imagine/take seriously than dying in the next second (though I imagine this varies between people).
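For anyone who wants the rough shape of the per-minute number (using my approximate inputs of ~60 million deaths per year and ~8 billion people, not exact figures):

$$\frac{6 \times 10^7 \text{ deaths/year}}{365 \times 24 \times 60 \text{ min/year}} \approx 114 \text{ deaths/min}, \qquad \frac{114}{8 \times 10^9} \approx 1.4 \times 10^{-8} \text{ deaths per human per minute.}$$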
Yes, you are completely right. I’ve added ‘farmed’ now; thanks for picking this up.
Thanks, really helpful!