Actually, I’m wondering why the “pay alignment tax” branch is attached to intent alignment rather than to the root, where “making AI go well” sits. The alignment tax is the difference between aligned AI and competent AI, but “aligned” here seems to mean aligned in the sense of good outcomes, not in the sense of “the AI tries to do what we want,” since it appears to include robustness, reliability, and so on, right? I mean that agreements, coordination, and so on, which fall under “pay alignment tax,” care about the AI actually being robust and reliable, i.e. that it won’t, for example, insert a backdoor into code generated by an AI assistant.