As someone working at the intersection of these subjects, I tend to agree with your conclusion, and with your next comment to Arden describing the design-implementation relationship.
Edit 19 Feb 2022: I want to clarify my position, namely, that I don’t see formal verification as a promising career path. As for what I write below, I don’t believe it is a very practical suggestion, and I am not at all sold on AI safety.
However, while thinking about this, I did come up with a (very rough) idea for AI alignment, where formal verification could play a significant role. One scenario for AGI takeoff, or for solving AI alignment, is to do it inductively: each generation of agents designs the next generation, which should be more sophisticated (and hopefully still aligned). Perhaps one plan to achieve this is as follows (I’m not claiming that any step is easy or even plausible):
1. Formally define what it means for an agent to be aligned, in such a way that subsequent agents designed by an aligned agent are also aligned (a toy sketch of the shape such a definition would need is given after the list).
2. Build your first generation of AI agents (which should be as lean and simple as possible, to make the next step easier).
3. Let a (perhaps computer-assisted) human prove that the first generation of AI is aligned in the formal sense of step 1.
4. Then, once you deploy the first generation of agents, it is their job to formally prove that the further agents they design are aligned as well. Hopefully, since they are very intelligent, and plausibly good at manipulating the previous formal proofs, they can find such proofs. Since each proof is formal, humans can trust and verify it (for example, using traditional proof checkers), despite not being able to come up with it themselves.
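To make step 1 slightly more concrete, here is a toy sketch in Lean 4 of the shape such a definition would need to have. Everything in it (`Agent`, `designs`, `Safe`, `Aligned`) is a hypothetical placeholder, not a proposed formal model of agents:

```lean
-- Toy sketch of step 1; all names are hypothetical placeholders.
-- `designs a b` is meant to say "agent a designs agent b", and `Safe`
-- stands for whatever base safety property one actually cares about.
structure AlignmentDef (Agent : Type) (designs : Agent → Agent → Prop)
    (Safe : Agent → Prop) where
  Aligned : Agent → Prop
  -- aligned agents satisfy the base safety property ...
  sound  : ∀ a, Aligned a → Safe a
  -- ... and alignment is inherited by anything an aligned agent designs,
  -- which is exactly the closure property the inductive plan relies on.
  closed : ∀ a b, Aligned a → designs a b → Aligned b
```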
This plan has many pitfalls (for example, each step may turn out to be extremely hard to carry out, or your definition of alignment may be so strict that the agents won’t be able to construct any new and interesting aligned agents); however, it is a possible way to be certain of having aligned AI.
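For what it’s worth, the sense in which the plan would deliver certainty (conditional on every step actually being carried out) is just ordinary induction over generations. A toy Lean 4 rendering, again with hypothetical placeholder names, where `gen n` is the n-th generation and `gen (n + 1)` is designed by `gen n`:

```lean
-- Toy sketch: if humans prove the base case (step 3) and the agents prove
-- the inductive step for their successors (step 4), every generation is
-- aligned. All names here are hypothetical placeholders.
theorem all_generations_aligned
    {Agent : Type} (Aligned : Agent → Prop) (gen : Nat → Agent)
    (base : Aligned (gen 0))                              -- step 3: proved by humans
    (step : ∀ n, Aligned (gen n) → Aligned (gen (n + 1))) -- step 4: proved by the agents
    : ∀ n, Aligned (gen n) := by
  intro n
  induction n with
  | zero => exact base
  | succ k ih => exact step k ih
```

Of course, all the difficulty is hidden in actually discharging `base` and `step` for a real definition of alignment, which is exactly where the pitfalls above bite.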