Just flagging this post, which has some advice on how to get “from zero to hero”: Levelling Up in AI Safety Research Engineering. According to this framework, I got to Level 3.
FangFang
Hi Henry :) Thanks a lot for your kind words, and for sharing your thoughts and resources on the topic! I am very grateful you’ve commented on the post as someone with a technical background. Will definitely have a look at them myself as well.
RE maths: I think I do understand the basics. We had quite a lot of that in high school, and the statistics courses included a lot of mathematics as well (especially probability). So I agree that you probably need some knowledge here, but maybe this is the reason why I didn’t need to go deeper(?)
This was a really interesting decision, thanks for highlighting it here! In the meantime it was followed by other courts with similar cases in the Netherlands and Australia.
Also, for people interested in a legal analysis of the decision, together with my colleague Renan Araújo from the Legal Priorities Project I wrote a blog post on the decision for a UK blog.
Some considerations that occurred to me which might prevent AI systems from becoming power-seeking by default:
Seeking power implies a time delay on the thing it’s actually trying to do, which could be against its preferences for various reasons.
The longer the time-frame, the more complexity and uncertainty will be added, like “how to gain power”, “will this help further the actual goal” etc.
So even if AI systems make plans / choose actions based on expected value calculations, just doing the thing they are trying to do might be the better strategy (even if gaining more power first would, if it worked, eventually let the AI system achieve its goal better).
Am I missing something? And are there any predictions on which of these two trends will win out? (I’m speaking of cases where we did not intend the system to be power-seeking, as opposed to, e.g., when you program the system to “make as much money as possible, forever”.)
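To make the trade-off concrete, here is a toy sketch of the comparison I have in mind. Everything in it is an assumption for illustration: the function names `plan_value` and `compare`, the per-step discount factor, and all of the probabilities, payoffs, and delays are made up. It is not a model of any real system.

```python
def plan_value(p_success, payoff, delay, discount):
    """Discounted expected value of a plan whose payoff arrives after `delay` steps."""
    return p_success * payoff * discount ** delay


def compare(discount):
    # Hypothetical numbers, chosen only to illustrate the trade-off.
    # Direct plan: moderate chance of success, payoff after 1 step.
    direct = plan_value(p_success=0.6, payoff=1.0, delay=1, discount=discount)
    # Power-first plan: 50% chance the power grab works, then a 90% chance of a
    # payoff twice as large, but only after 5 steps.
    power_first = plan_value(p_success=0.5 * 0.9, payoff=2.0, delay=5, discount=discount)
    return direct, power_first


for d in (1.0, 0.9):
    direct, power_first = compare(d)
    better = "direct" if direct > power_first else "power-first"
    print(f"discount={d}: direct={direct:.3f}, power_first={power_first:.3f} -> {better}")
```

With these made-up numbers, the power-first plan wins when the delay costs nothing (discount of 1.0), and the direct plan comes out slightly ahead once each step of delay is discounted by 0.9. So which trend wins depends heavily on how much the system dislikes delay and how uncertain the power-gaining step is.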