Planned summary for the Alignment Newsletter:

This post describes how Buck’s cause prioritization within an effective altruism framework leads him to work on AI risk. The case can be broken down into a conjunction of five cruxes. Specifically, the story for impact is that AGI 1) would be a big deal if it were created, 2) has a decent chance of being created soon, before any other “big deal” technology is created, and 3) poses an alignment problem that we both **can** and **need to** think ahead in order to solve. His research 4) would be put into practice if it solved the problem, and 5) makes progress on solving the problem.
Planned opinion:
I enjoyed this post, and recommend reading it in full if you are interested in AI risk because of effective altruism. (I’ve kept the summary relatively short because not all of my readers care about effective altruism.) My personal cruxes and story of impact are actually fairly different: in particular, while this post sees the impact of research as coming from solving the technical alignment problem, I care about other sources of impact as well. See this comment for details.
I think your summary of crux three is slightly wrong: I didn’t say that we need to think about it ahead of time, I just said that we can.
My interpretation was that the crux was “we both **can** and **need to** think ahead in order to solve [the alignment problem]”.
One thing this leaves implicit is the counterfactual: in particular, I thought the point of the “Problems solve themselves” section was that if problems would be solved by default, then you can’t do good by thinking ahead. I wanted to make that clearer, which led to the phrasing above, where “can” talks about feasibility and “need to” talks about the counterfactual.
I can remove the “and **need to**” if you think this is wrong.
I’d prefer something like the weaker and less clear statement “we **can** think ahead, and it’s potentially valuable to do so even given the fact that people might try to figure this all out later”.