Since AI X-risk is a main cause area for EA, shouldn't significant money be going into mechanistic interpretability? After reading the AI 2027 forecast, it seems to me that the opacity of AI systems is the main source of the risk they pose. Making significant progress in this field seems very important for alignment.
I took the Giving What We Can Pledge, and I want to say there should be something like it but specifically for mechanistic interpretability. Realistically, though, probably only very few people could be convinced to give 10% of their income to it.
I’ve had similar considerations. Manifund has projects you can fund directly, some of which are about interpretability. Without specialized knowledge, though, I find it difficult to trust my own judgement over that of people whose job it is to research and think strategically about marginal impact.
What do you mean by the opacity of AIs appearing to be the main source of risk?