Really liked this, Brandon. When I started getting into AI safety last year, I had the same impression: not enough coordination. This was one of the reasons I helped build v2 of the map (https://www.aisafety.com/map), so I’m glad it was helpful as a starting point.
Several thoughts:
Agree—control is neglected: I agree with your potentially hot take that control research looks oddly neglected relative to alignment. If alignment breakthroughs still require robust control to prevent bad actors or accidents, the resource balance seems off, which is why I’m working on control (https://luthienresearch.org/). Perhaps developing a comprehensive, centralized, high-level plan might make this the consensus view and lead to resource reallocation.
ai-plans.com: I just looked and there’s only one plan above zero karma. To me that’s evidence for your claim that the community offers great critiques but little replacement—so either improvements to the ai-plans.com user interface or, as you propose, a living plan document (with a “patch or provide-better” norm) could be useful.
Theories of Change: I recently read a theory of change post (https://forum.effectivealtruism.org/posts/9t7St3pfEEiDsQ2Tr/nailing-the-basics-theories-of-change) and found it instructive. I wonder if a bunch of similarly-formatted theory of change diagrams could ladder up into the high-level visualization you describe. Having all theories of change compiled in one place could also help develop metrics for how each org, and the community as a whole, is making progress in each of the theory of change “columns” (inputs, outputs, impact).
As a next step, perhaps a call with Kabir could be valuable to get his take on pain points and opportunities from his experience facilitating ai-plans.com discussions (I’d be happy to join as an optional attendee). Then, I think it would be straightforward to mock up the user interface you describe with AI tools (e.g., Cursor, Lovable) to make it real and get some early feedback.
Thanks Scott! And for your work on the AI safety map—it was a great surprise to find out you had helped with that!
Agree with you on your points, especially around theories of change and the post you shared. I feel like the highest-value, lowest-effort thing an org can do is ensure the work it’s doing is valuable and impactful in the first place. Without an explicit theory of change, or a sense of how their org/ToC fits into the “larger picture”, well-intentioned people can be stuck spinning their wheels. Without a centralized plan (the larger picture made explicit), I think your proposal of compiling organizations’ ToCs could be a great place to start.
I’ve booked a call with Kabir and will definitely loop you in depending on how that goes!