Some considerations I've been thinking about that might prevent AI systems from becoming power-seeking by default:
Seeking power first introduces a delay before the system gets to the thing it's actually trying to do, which could go against its preferences for various reasons (e.g., if it discounts future reward).
The longer the time frame, the more complexity and uncertainty the plan accrues: figuring out how to gain power, whether that power would actually further the original goal, etc.
So even if AI systems make plans / choose actions based on expected-value calculations, just doing the thing directly might come out as the better strategy, even if gaining more power first would, if it worked, eventually let the system achieve its goal more fully.
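To make the trade-off concrete, here is a toy expected-value sketch. All the numbers, the discount factor, and the way uncertainty is modeled are my own illustrative assumptions, not a claim about how any actual system would reason:

```python
# Toy comparison: "pursue the goal directly now" vs. "seek power first, then act".
# Purely illustrative; every parameter below is a made-up assumption.

def ev_direct(goal_value: float, p_success: float) -> float:
    """Expected value of pursuing the goal immediately."""
    return p_success * goal_value

def ev_power_seek(goal_value: float,
                  p_success_boosted: float,   # success chance *after* gaining power
                  p_power_grab_works: float,  # chance the power-seeking step itself succeeds
                  discount: float,            # per-step discount factor (< 1)
                  delay_steps: int) -> float:
    """Expected value of first seeking power, then pursuing the goal."""
    return (discount ** delay_steps) * p_power_grab_works * p_success_boosted * goal_value

# Example: power helps a lot (success chance 0.5 -> 0.9), but the power grab
# is itself uncertain and adds delay.
direct = ev_direct(goal_value=100, p_success=0.5)                       # 50.0
indirect = ev_power_seek(goal_value=100, p_success_boosted=0.9,
                         p_power_grab_works=0.6, discount=0.95,
                         delay_steps=10)                                # ~32.3
print(direct, indirect)  # under these assumptions, acting directly wins
```

Of course, flip the assumptions (little or no discounting, a near-certain power grab, a big enough boost from power) and the power-seeking branch wins instead, which is the crux of my question.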
Am I missing something? And are there any predictions about which of these two effects will win out? (I'm talking about cases where we did not intend the system to be power-seeking, as opposed to, e.g., programming the system to "make as much money as possible, forever".)
I asked a similar question in the LW thread: https://www.lesswrong.com/posts/SFuLQA7guCnG8pQ7T/all-agi-safety-questions-welcome-especially-basic-ones-may?commentId=La8GtcDSKASbgvG2J