Some considerations I came to think about which might prevent AI systems from becoming power-seeking by default:
Seeking power implies a time delay on the thing it’s actually trying to do, which could be against its preferences for various reasons.
The longer the time-frame, the more complexity and uncertainty will be added, like “how to gain power”, “will this help further the actual goal” etc.
So even if AI systems make plans / chose actions based on expected value calculations, just doing the thing they are trying to do might be the better strategy. (Even if gaining more power first would, if it worked, eventually make the AI system better achieve its goal).
Am I missing something? And are there any predictions on which of these two trends will win out? (I’m speaking of cases where we did not intend the system to be power-seeking, as opposed to, e.g., when you program the system to “make as much money as possible, forever”.)
This was a really interesting decision, thanks for highlighting it here! In the meantime it was followed by other courts with similar cases in the Netherlands and Australia.
Also, for people interested in a legal analysis of the decision, together with my colleague Renan Araújo from the Legal Priorities Project I wrote a blog post on the decision for a UK blog.