You can essentially think of it as two separate problems:
Problem 1: Conditional on us having a technical solution to AI alignment, how do we ensure the first AGI built implements it?
Problem 2: Conditional on us having a technical solution to AI alignment, how do we ensure no AGI is ever built that does NOT implement it, or some other equivalent solution?
I feel like you are talking about Problem 1, and Locke is talking about Problem 2. I agree with the MIRI-type view that Problem 1 is easy to solve, and that the hard part of that problem is having the solution in the first place. I do believe the existing labs working on AGI would implement a solution to AI alignment if we had one. That still leaves Problem 2 to be solved—though at least by the time we’re facing Problem 2, we have an aligned AGI to help with it.
Hmm. I don’t have strong views on unipolar vs. multipolar outcomes, but I think the MIRI-type view is that Problem 2 is also easy to solve, for the reasons in the last couple of clauses of your comment.