Thanks for the comment. I wasn’t aware of yours and Rohin’s discussion on Arden’s post. Did you flesh out the inductive alignment idea on lw or alignment forum? It seems really promising to me.
I want to jot down notes more substantive than “wait until I post ‘Going Long on FV’ in a few months” today.
FV in AI Safety in particular
As Rohin’s comment suggests, both aiming proofs about properties of models toward today’s type theories and aiming tomorrow’s type theories toward ML have two classes of obstacles: 1. is it possible? 2. can it be made competitive?
I’ve gathered that there’s a lot of pessimism about 1, in spite of MIRI’s investment in type theory and in spite of the word “provably” in CHAI’s charter. My personal expected path to impact as it concerns 1. is “wait until theorists smarter than me figure it out”, and I want to position myself to worry about 2..
I think there’s a distinction between theories and products, and I think programmers need to be prepared to commercialize results. There’s a fundamental question: should we expect that a theory’s competitiveness can be improved one or more orders of magnitude by engineering effort, or will engineering effort only provide improvements of less than an order of magnitude? I think a lot depends on how you feel about this.
Asya:
While I agree that proof assistants right now are much slower than doing math proofs yourself, verification is a pretty immature field. I can imagine them becoming a lot better such that they do actually become better to use than doing math proofs yourself, and don’t think this would be the worst thing to invest in.
Asya may not have been speaking about AI safety here, but my basic thinking is that if less primitive proof assistants end updrastically more competitive, and at the same time there are opportunities convert results in verified ML into tooling, expertise in this area could gain a lot of leverage.
FV in other paths to impact
Rohin:
it is plausibly still worthwhile becoming an expert on formal verification because of the potential applications to cybersecurity. (Though it seems like in that case you should just become an expert on cybersecurity.)
It’s not clear to me that grinding FV directly is as wise as, say, CompTIA certifications. From the expectation that FV pays dividends in advanced cybersec, we cannot conclude that FV is relevant to early stages of a cybersec path.
I was speaking about AI safety! To clarify, I meant that investments in formal verification work could in part be used to develop those less primitive proof assistants.
Thanks for the comment. I wasn’t aware of yours and Rohin’s discussion on Arden’s post. Did you flesh out the inductive alignment idea on lw or alignment forum? It seems really promising to me.
I want to jot down notes more substantive than “wait until I post ‘Going Long on FV’ in a few months” today.
FV in AI Safety in particular
As Rohin’s comment suggests, both aiming proofs about properties of models toward today’s type theories and aiming tomorrow’s type theories toward ML have two classes of obstacles: 1. is it possible? 2. can it be made competitive?
I’ve gathered that there’s a lot of pessimism about 1, in spite of MIRI’s investment in type theory and in spite of the word “provably” in CHAI’s charter. My personal expected path to impact as it concerns 1. is “wait until theorists smarter than me figure it out”, and I want to position myself to worry about 2..
I think there’s a distinction between theories and products, and I think programmers need to be prepared to commercialize results. There’s a fundamental question: should we expect that a theory’s competitiveness can be improved one or more orders of magnitude by engineering effort, or will engineering effort only provide improvements of less than an order of magnitude? I think a lot depends on how you feel about this.
Asya:
Asya may not have been speaking about AI safety here, but my basic thinking is that if less primitive proof assistants end up drastically more competitive, and at the same time there are opportunities convert results in verified ML into tooling, expertise in this area could gain a lot of leverage.
FV in other paths to impact
Rohin:
It’s not clear to me that grinding FV directly is as wise as, say, CompTIA certifications. From the expectation that FV pays dividends in advanced cybersec, we cannot conclude that FV is relevant to early stages of a cybersec path.
Related: Information security careers for GCR reduction. I think the software safety standards in a wide variety of fields have a lot of leverage over outcomes.
I was speaking about AI safety! To clarify, I meant that investments in formal verification work could in part be used to develop those less primitive proof assistants.