A key crux here seems to be your claim that AI’s will attempt these plans before they have the relevant capacities because they are on short time scales. However, given enough time and patience, it seems clear to me that the AI could succeed simply by not taking risky actions that it knows it might mess up on until it self improves to be able to take those actions. The question then becomes how long the AI think it has until another AI that could dominate it is built, as well as how fast self improvement is.
A key crux here seems to be your claim that AI’s will attempt these plans before they have the relevant capacities because they are on short time scales. However, given enough time and patience, it seems clear to me that the AI could succeed simply by not taking risky actions that it knows it might mess up on until it self improves to be able to take those actions. The question then becomes how long the AI think it has until another AI that could dominate it is built, as well as how fast self improvement is.